Preparation of acariogenic sugar substitutes

ABSTRACT

The invention relates to sucrose isomerases, to DNA sequences that code for sucrose isomerases, and to novel processes for the production of non-cariogenic sugars.

[0001] This is a Continuation in Part of U.S. application Ser. No. 09/168,720, filed Oct. 9, 1998, which is a Divisional Application of U.S. application Ser. No. 08/785,396, now U.S. Pat. No. 5,985,622, filed Jan. 21, 1997, which is a Divisional Application of U.S. application Ser. No. 08/374,155, now U.S. Pat. No. 5,786,140, filed Jan. 18, 1995, which claims priority of DE P 44 01 451.1, filed Jan. 19, 1994 and DE 44 14 185.8, filed Apr. 22, 1994. All of these documents are expressly incorporated by reference. All patents and other publications listed herein are expressly incorporated by reference.

[0002] The present invention relates to an improved process for the preparation of non-cariogenic sugars, in particular trehalulose and/or palatinose, using recombinant DNA technology.

[0003] The acariogenic sugar substitutes palatinose (isomaltulose) and trehalulose are produced on a large scale from sucrose by an enzymatic rearrangement using immobilized bacterial cells (for example of the species Protaminobacter rubrum, Erwinia rhapontici, Serratia plymuthica). This entails the α1-β2 glycosidic linkage existing between the two monosaccharide units of the disaccharide sucrose being isomerized to an α1-6 linkage in palatinose and to an α1-α1 linkage in trehalulose. This rearrangement of sucrose to give the two acariogenic disaccharides takes place with catalysis by the bacterial enzyme sucrose isomerase, also called sucrose mutase. Depending on the organism used, this reaction results in a product mixture which, besides the desired acariogenic disaccharides palatinose and trehalulose, also contains certain proportions of unwanted monosaccharides (glucose and/or fructose). These monosaccharide contents are a considerable industrial problem because elaborate purification procedures (usually fractional crystallizations) are necessary to remove them.

[0004] For example, EP-0 028 900 describes a method for producing palatinose in a bioreactor by using immobilized sucrose isomerase, which was purified and immobilized from a raw extract by selectively binding to an anionic carrier matrix. The product composition obtained by this method contains, apart from the desired acariogenic disaccharides palatinose and trehalulose, 2.1-2.5% of the unwanted monosaccharide fructose and 0.6-1.0% of the unwanted monosaccharide glucose.

[0005] Further, EP-0 483 755 describes a method for producing trehalulose and palatinose, wherein a sucrose solution is contacted with at least one trehalulose-forming enzyme system of a trehalulose-forming microorganism at a temperature of 10-35° C., wherein a mostly trehalulose-containing product mixture is obtained, which, however, contains low amounts of the unwanted monosaccharides fructose and glucose. It could further be shown that the amount of unwanted monosaccharides increases drastically when the higher incubation temperatures preferred in large-scale technical methods are used and that, in addition, a rapid thermal inactivation of the enzyme preparations occurs.

[0006] One object on which the present invention is based was thus to suppress as far as possible the formation of monosaccharides in the isomerization of sucrose to trehalulose and/or palatinose. Another object on which the present invention is based was to provide organisms which produce palatinose and/or trehalulose in a higher yield than do known organisms.

[0007] To achieve these objects, recombinant DNA molecules, organisms transformed with recombinant DNA molecules, recombinant proteins and an improved process for the preparation of non-cariogenic sugars, in particular of palatinose and/or trehalulose, are provided.

[0008] The invention relates to a DNA sequence which codes for a protein with a sucrose isomerase activity and comprises

[0009] (a) one of the nucleotide sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13, where appropriate without the signal peptide-coding region,

[0010] (b) a nucleotide sequence corresponding to the sequences from (a) within the scope of the degeneracy of the genetic code, or

[0011] (c) a nucleotide sequence which hybridizes with the sequences from (a) and/or (b).

[0012] In the context of the present invention, the term “protein with a sucrose isomerase activity” is intended to embrace those proteins which are able to isomerize sucrose to other disaccharides with conversion of the α1-β2 glycosidic linkage between glucose and fructose in sucrose into another glycosidic linkage between two monosaccharide units, in particular into an α1-6 linkage and/or an α1-α1 linkage. The term “protein with a sucrose isomerase activity” therefore particularly preferably relates to a protein which is able to isomerize sucrose to palatinose and/or trehalulose. Moreover, the proportion of palatinose and trehalulose in the total disaccharides formed by isomerization of sucrose is preferably ≧2%, particularly preferably ≧20% and most preferably ≧50%.

[0013] The nucleotide sequence shown in SEQ ID NO: 1 codes for the complete sucrose isomerase from the microorganism Protaminobacter rubrum (CBS 547.77) including the signal peptide region. The nucleotide sequence shown in SEQ ID NO: 2 codes for the N-terminal section of the sucrose isomerase from the microorganism Erwinia rhapontici (NCPPB 1578) including the signal peptide region. The nucleotide sequence shown in SEQ ID NO: 3 codes for a section of the sucrose isomerase from the microorganism SZ 62 (Enterobacter spec.).

[0014] The region which codes for the signal peptide in SEQ ID NO: 1 extends from nucleotide 1-99. The region coding for the signal peptide in SEQ ID NO: 2 extends from nucleotide 1-108. The DNA sequence according to the present invention also embraces the nucleotide sequences shown in SEQ ID NO: 1 and SEQ ID NO: 2 without the region coding for the signal peptide because the signal peptide is, as a rule, necessary only for correct localization of the mature protein in a particular cell compartment (for example in the periplasmic space between the outer and inner membrane, in the outer membrane or in the inner membrane) or for extracellular export, but not for the enzymatic activity as such. The present invention thus furthermore embraces sequences which also code for the mature protein (without signal peptide) and are operatively linked to heterologous signal sequences, in particular to prokaryotic signal sequences as described, for example, in E. L. Winnacker, Gene und Klone, Eine Einführung in die Gentechnologie, VCH-Verlagsgesellschaft Weinheim, Germany (1985), p. 256.
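The nucleotide ranges given above for the signal peptide-coding regions can be related to amino-acid positions by simple codon arithmetic. The following is a minimal illustrative sketch only, assuming that the reading frame begins at nucleotide 1 of each sequence and that positions are 1-based and inclusive; the function name `codon_span_to_residues` is hypothetical and not part of the disclosure.

```python
def codon_span_to_residues(nt_start, nt_end):
    """Map an inclusive, 1-based nucleotide span to the residues it encodes.

    Assumes the reading frame starts at nucleotide 1, so the span must
    begin on a codon boundary and cover a whole number of codons.
    """
    length = nt_end - nt_start + 1
    if (nt_start - 1) % 3 != 0 or length % 3 != 0:
        raise ValueError("span does not cover whole codons in frame 1")
    first_residue = (nt_start - 1) // 3 + 1
    last_residue = nt_end // 3
    return first_residue, last_residue


# Signal peptide-coding regions as stated in the text above.
signal_regions = {"SEQ ID NO: 1": (1, 99), "SEQ ID NO: 2": (1, 108)}

for name, (start, end) in signal_regions.items():
    first, last = codon_span_to_residues(start, end)
    print(f"{name}: nucleotides {start}-{end} encode residues {first}-{last}; "
          f"mature protein would start at residue {last + 1}")
```

Under these assumptions, nucleotides 1-99 correspond to amino acids 1-33 and nucleotides 1-108 to amino acids 1-36, which agrees with the signal peptide lengths stated for SEQ ID NO: 4 and SEQ ID NO: 5.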

[0015] Nucleotide sequence SEQ ID NO: 9 codes for a variant of the isomerase from Protaminobacter rubrum. Nucleotide sequence SEQ ID NO: 11 codes for the complete isomerase from the isolate SZ 62. Nucleotide sequence SEQ ID NO: 13 codes for most of the isomerase from the microorganism MX-45 (FERM 11808 or FERM BP 3619).

[0016] Besides the nucleotide sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13, and nucleotide sequences corresponding to one of these sequences within the scope of the degeneracy of the genetic code, the present invention also embraces a DNA sequence which hybridizes with one of these sequences, provided that it codes for a protein which is able to isomerize sucrose. The term “hybridization” according to the present invention is used as in Sambrook et al. (Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989), 1.101-1.104). According to the present invention, hybridization is the word used when a positive hybridization signal is still observed after washing for 1 hour with 1×SSC and 0.1% SDS at 55° C., preferably at 62° C. and particularly preferably at 68° C., in particular for 1 hour in 0.2×SSC and 0.1% SDS at 55° C., preferably at 62° C. and particularly preferably at 68° C. A nucleotide sequence which hybridizes under such washing conditions with one of the nucleotide sequences shown in SEQ ID NO: 1 or SEQ ID NO: 2, or with a nucleotide sequence which corresponds thereto within the scope of the degeneracy of the genetic code, is a nucleotide sequence according to the invention.

[0017] The DNA sequence according to the invention preferably has

[0018] (a) one of the nucleotide sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13, where appropriate without the signal peptide-coding region, or

[0019] (b) a nucleotide sequence which is at least 70% homologous with the sequences from (a).

[0020] The DNA sequence according to the invention preferably also has an at least 80% homologous nucleotide sequence to the conserved part-regions of the nucleotide sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13. These conserved part-regions are, in particular, from nucleotide 139-186, nucleotide 256-312, nucleotide 328-360, nucleotide 379-420 and/or nucleotide 424-444 in the nucleotide sequence shown in SEQ ID NO: 1.

[0021] In a particularly preferred embodiment, the DNA sequence according to the invention has an at least 80% homologous, in particular an at least 90% homologous, nucleotide sequence to the part-regions

[0022] (a) nucleotide 139-155 and/or

[0023] (b) nucleotide 625-644

[0024] of the nucleotide sequence shown in SEQ ID NO: 1.

[0025] Oligonucleotides derived from the above sequence regions have proved suitable as primers for PCR amplification of isomerase fragments from the genomic DNA of a large number of tested microorganisms, for example Protaminobacter rubrum (CBS 547.77), Erwinia rhapontici (NCPPB 1578), isolate SZ 62 and Pseudomonas mesoacidophila MX-45 (FERM 11808).

[0026] Particularly preferably used for this purpose are the following oligonucleotides, where appropriate in the form of mixtures, where the bases in parentheses can be present as alternatives:

[0027] Oligonucleotide I (17 nt): 5′-TGGTGGAA(A,G)GA(G,A)GCTGT-3′ (SEQ ID NO: 17)

[0028] Oligonucleotide II (20 nt): 5′-TCCCAGTTCAG(G,A)TCCGGCTG-3′ (SEQ ID NO: 18)

[0029] Oligonucleotide I is derived from nucleotides 139-155 of SEQ ID NO: 1, and oligonucleotide II is derived from the sequence, complementary to nucleotides 625-644, of SEQ ID NO: 1. The differences between the homologous part-regions of the DNA sequences according to the invention and the sequences called oligonucleotide I and oligonucleotide II are preferably in each case not more than 2 nucleotides and particularly preferably in each case not more than 1 nucleotide.
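The parenthesized alternatives in oligonucleotides I and II denote mixtures of concrete sequences, and the mismatch limits stated above can be checked against each member of the mixture. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the example part-regions passed to `min_mismatches` are invented for illustration and are not taken from the sequence listing, and a "difference" is counted as a position-wise substitution between sequences of equal length.

```python
import itertools
import re


def expand_degenerate(pattern):
    """Expand notation such as 'AA(A,G)GA' into all concrete sequences."""
    tokens = re.findall(r"\(([^)]+)\)|([ACGT])", pattern)
    choices = [alts.split(",") if alts else [base] for alts, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def min_mismatches(region, pattern):
    """Smallest number of substitutions between region and any mixture member."""
    variants = [v for v in expand_degenerate(pattern) if len(v) == len(region)]
    if not variants:
        raise ValueError("region length does not match the oligonucleotide")
    return min(sum(a != b for a, b in zip(region, v)) for v in variants)


OLIGO_I = "TGGTGGAA(A,G)GA(G,A)GCTGT"   # 17 nt, as given in the text
OLIGO_II = "TCCCAGTTCAG(G,A)TCCGGCTG"  # 20 nt, as given in the text

print(expand_degenerate(OLIGO_I))        # four 17-nt sequences
print(expand_degenerate(OLIGO_II))       # two 20-nt sequences

# Hypothetical part-region with a single substitution at the last position.
hypothetical_region = "TGGTGGAAGGAAGCTGA"
print(min_mismatches(hypothetical_region, OLIGO_I))
```

Because oligonucleotide II is derived from the complementary strand, a part-region taken from the coding strand would be passed through `reverse_complement` before comparison in this sketch.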

[0030] In another particularly preferred embodiment of the present invention, the DNA sequence has an at least 80% homologous, in particular an at least 90% homologous, nucleotide sequence to the part-regions of

[0031] (c) nucleotide 995-1013 and/or

[0032] (d) nucleotide 1078-1094

[0033] of the nucleotide sequence shown in SEQ ID NO: 1.

[0034] Oligonucleotides derived from the above sequence regions hybridize with sucrose isomerase genes from the organisms Protaminobacter rubrum and Erwinia rhapontici. The following oligonucleotides, where appropriate in the form of mixtures, are particularly preferably used, where the bases indicated in parentheses may be present as alternatives:

[0035] Oligonucleotide III (19 nt): AAAGATGGCG(G,T)CGAAAAGA (SEQ ID NO: 19)

[0036] Oligonucleotide IV (17 nt): 5′-TGGAATGCCTT(T,C)TTCTT-3′ (SEQ ID NO: 20)

[0037] Oligonucleotide III is derived from nucleotides 995-1013 of SEQ ID NO: 1, and oligonucleotide IV is derived from nucleotides 1078-1094 of SEQ ID NO: 1. The differences between the homologous part-regions of the DNA sequences according to the invention and the sequences called oligonucleotide III and IV are preferably in each case not more than 2 nucleotides and particularly preferably in each case not more than 1 nucleotide.
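As a simple consistency check, the stated oligonucleotide lengths can be compared with the lengths of the part-regions of SEQ ID NO: 1 from which they are derived. The sketch below is illustrative only and assumes that the nucleotide ranges are 1-based and inclusive; the dictionary layout and the function name `span_length` are hypothetical.

```python
def span_length(start, end):
    """Length of an inclusive, 1-based nucleotide range."""
    return end - start + 1


# (start, end, stated oligonucleotide length in nt), as given in the text.
oligo_regions = {
    "I": (139, 155, 17),
    "II": (625, 644, 20),
    "III": (995, 1013, 19),
    "IV": (1078, 1094, 17),
}

for name, (start, end, stated_nt) in oligo_regions.items():
    computed = span_length(start, end)
    status = "consistent" if computed == stated_nt else "inconsistent"
    print(f"Oligonucleotide {name}: region {start}-{end} spans "
          f"{computed} nt, stated {stated_nt} nt ({status})")
```

Under the inclusive-range assumption, all four regions span exactly the stated number of nucleotides.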

[0038] Nucleotide sequences according to the invention can be obtained in particular from microorganisms of the genera Protaminobacter, Erwinia, Serratia, Leuconostoc, Pseudomonas, Agrobacterium and Klebsiella. Specific examples of such microorganisms are Protaminobacter rubrum (CBS 547.77), Erwinia rhapontici (NCPPB 1578), Serratia plymuthica (ATCC 15928), Serratia marcescens (NCIB 8285), Leuconostoc mesenteroides NRRL B-521f (ATCC 10830a), Pseudomonas mesoacidophila MX-45 (FERM 11808 or FERM BP 3619), Agrobacterium radiobacter MX-232 (FERM 12397 or FERM BP 3620), Klebsiella subspecies and Enterobacter species. The nucleotide sequences according to the invention can be isolated in a simple manner from the genome of the relevant microorganisms, for example using oligonucleotides from one or more of the conserved regions of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11 and SEQ ID NO: 13, by standard techniques of amplification and/or hybridization, and be characterized. The nucleotide sequences according to the invention are preferably obtained by PCR amplification of the genomic DNA of the relevant organism using oligonucleotides I and II. A part-fragment of the relevant sucrose isomerase gene is obtained in this way and can subsequently be used as hybridization probe for isolating the complete gene from a gene bank of the relevant microorganism. Alternatively, the nucleotide sequences can be obtained by producing a gene bank from the particular organism and direct screening of this gene bank with oligonucleotides I, II, III and/or IV.

[0039] The present invention further relates to a vector which contains at least one copy of a DNA sequence according to the invention. This vector can be any prokaryotic or eukaryotic vector on which the DNA sequence according to the invention is preferably under the control of an expression signal (promoter, operator, enhancer, etc.). Examples of prokaryotic vectors are chromosomal vectors such as, for example, bacteriophages (for example bacteriophage λ) and extrachromosomal vectors such as, for example, plasmids, with circular plasmid vectors being particularly preferred. Suitable prokaryotic vectors are described, for example, in Sambrook et al., supra, Chapters 1-4.

[0040] A particularly preferred example of a vector according to the invention is the plasmid pHWS 88 which harbors a sucrose isomerase gene from Protaminobacter rubrum (with the sequence shown in SEQ ID NO: 1) under the control of the regulatable tac promoter. The plasmid pHWS 88 was deposited on Dec. 16, 1993, at the Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSM), Mascheroder Weg 1b, 38124 Braunschweig, Germany, under the deposit number DSM 8824 in accordance with the provisions of the Budapest Treaty.

[0041] Two further preferred examples of a vector according to the invention are the plasmids pHWG314 and pHWG315, which harbor a sucrose isomerase gene from Protaminobacter rubrum and Pseudomonas mesoacidophila MX-45, respectively, under the control of a regulatable rhamnose promoter. This promoter is described in Wiese et al. (Arch. Microbiol. 176 (2001):187-196) and Wiese et al. (Appl. Microbiol. 55 (2001), 750-757).

[0042] In another preferred embodiment of the present invention, the vector according to the invention is a plasmid which is present in the host cell with a copy number of less than 10, particularly preferably with a copy number of 1 to 2 copies per host cell. Examples of vectors of this type are, on the one hand, chromosomal vectors such as, for example, bacteriophage λ or F plasmids. F plasmids which contain the sucrose isomerase gene can be prepared, for example, by transformation of an E. coli strain which contains an F plasmid with a transposon containing the sucrose isomerase gene, and subsequent selection for recombinant cells in which the transposon has integrated into the F plasmid. One example of a recombinant transposon of this type is the plasmid pHWS 118 which contains the transposon Tn 1721 Tet and was prepared by cloning a DNA fragment containing the sucrose isomerase gene from the above-described plasmid pHWS 88 into the transposon pJOE 105 (DSM 8825).

[0043] On the other hand, the vector according to the invention can also be a eukaryotic vector, for example a yeast vector (for example YIp, YEp, etc.) or a vector suitable for higher cells (for example a plasmid vector, viral vector, plant vector). Vectors of these types are familiar to the person skilled in the area of molecular biology so that details thereof need not be given here. Reference is made in this connection in particular to Sambrook et al., supra, Chapter 16.

[0044] The present invention further relates to a cell which is transformed with a DNA sequence according to the invention or a vector according to the invention. In one embodiment, this cell is a prokaryotic cell, preferably a Gram-negative prokaryotic cell, particularly preferably an enterobacterial cell. It is moreover possible on the one hand to use a cell which contains no sucrose isomerase gene of its own, such as, for example, E. coli, but it is also possible, on the other hand, to use cells which already contain such a gene on their chromosome, for example the microorganisms mentioned above as source of sucrose isomerase genes. Preferred examples of suitable prokaryotic cells are E. coli, Protaminobacter rubrum or Erwinia rhapontici cells. The transformation of prokaryotic cells with exogenous nucleic acid sequences is familiar to a person skilled in the area of molecular biology (see, for example, Sambrook et al., supra, Chapters 1-4).

[0045] In another embodiment of the present invention, the cell according to the invention may, however, also be a eukaryotic cell such as, for example, a fungal cell (for example yeast), an animal cell or a plant cell. Methods for the transformation or transfection of eukaryotic cells with exogenous nucleic acid sequences are likewise familiar to the person skilled in the area of molecular biology and need not be explained here in detail (see, for example, Sambrook et al., supra, Chapter 16).

[0046] The invention also relates to a protein with a sucrose isomerase activity as defined above, which is encoded by a DNA sequence according to the invention. This protein preferably comprises

[0047] (a) one of the amino-acid sequences shown in SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 10, SEQ ID NO: 12 or SEQ ID NO: 14, where appropriate without the signal peptide region or

[0048] (b) an amino-acid sequence which is at least 80% homologous with the sequences from (a).

[0049] The amino-acid sequence shown in SEQ ID NO: 4 comprises the complete sucrose isomerase from Protaminobacter rubrum. The signal peptide extends from amino acid 1-33. The mature protein starts at amino acid 34. The amino-acid sequence shown in SEQ ID NO: 5 comprises the N-terminal section of the sucrose isomerase from Erwinia rhapontici. The signal peptide extends from amino acid 1-36. The mature protein starts at amino acid 37. The amino-acid sequence shown in SEQ ID NO: 6 comprises a section of the sucrose isomerase from the microorganism SZ 62. FIG. 1 compares the amino-acid sequences of the isomerases from P. rubrum, E. rhapontici and SZ 62.

[0050] Amino-acid sequence SEQ ID NO: 10 comprises a variant of the isomerase from P. rubrum. Amino-acid sequence SEQ ID NO: 12 comprises the complete isomerase from SZ 62. This enzyme has a high activity at 37° C. and produces only a very small proportion of monosaccharides. Amino-acid sequence SEQ ID NO: 14 comprises a large part of the isomerase from MX-45. This enzyme produces about 85% trehalulose and 13% palatinose.

[0051] The protein according to the invention particularly preferably has an at least 90% homologous amino-acid sequence to conserved part-regions from the amino-acid sequences shown in SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 10, SEQ ID NO: 12 or SEQ ID NO: 14, especially in part-regions from

[0052] (a) amino acid 51-149,

[0053] (b) amino acid 168-181,

[0054] (c) amino acid 199-250,

[0055] (d) amino acid 351-387 and/or

[0056] (e) amino acid 390-420

[0057] of the amino-acid sequence shown in SEQ ID NO: 4.

[0058] It is possible by means of the above mentioned DNA sequences, vectors, transformed cells and proteins to provide a sucrose isomerase activity in a simple manner without interfering additional enzymatic activities. Preferably, the sucrose isomerase activity is >30 units/mg and more preferably >45 units/mg. Most preferably the sucrose isomerase activity lies in the range of from 45 units/mg to 150 units/mg.

[0059] It is possible for this purpose on the one hand to obtain the sucrose isomerase by recombinant DNA technology as constituent of an extract from the host organism or in isolated and purified form (for example by expression in E. coli). This preferably purified and isolated sucrose isomerase enzyme can be used, for example, in immobilized form, for the industrial production of acariogenic sugars such as, for example, trehalulose and/or palatinose by reaction of sucrose in an enzyme reactor. The immobilization of enzymes is familiar to a skilled person and need not be described in detail here.

[0060] On the other hand, the production of acariogenic sugars from sucrose can also take place in a complete microorganism, preferably in immobilized form. Cloning of the abovementioned sucrose isomerase gene into an organism without or with reduced palatinose and/or trehalulose metabolism (that is to say in an organism which is unable significantly to degrade the abovementioned sugars) allows generation of a novel organism which, owing to the introduction of exogenous DNA, is able to produce acariogenic disaccharides with negligible formation of monosaccharides. Thus, suitable for introducing the sucrose isomerase gene is, on the one hand, an organism which is unable to utilize palatinose and/or trehalulose (for example E. coli, bacillus, yeast) and, on the other hand, an organism which would in principle be able to utilize palatinose and/or trehalulose but has reduced palatinose and/or trehalulose metabolism owing to undirected or directed mutation.

[0061] The term “reduced palatinose and/or trehalulose metabolism” means for the purpose of the present invention that a whole cell of the relevant organism produces, on utilization of sucrose as C source, acariogenic disaccharides but is able to utilize the latter to only a small extent in metabolism, for example by degrading them to monosaccharides. The organism preferably produces less than 2.5%, particularly preferably less than 2%, most preferably less than 1%, of glucose plus fructose based on the total of acariogenic disaccharides and monosaccharide degradation products at a temperature of 15-65° C., in particular of 25-55° C.
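The percentage criterion in the preceding paragraph can be illustrated by a short calculation. The following is a minimal sketch, assuming that the acariogenic disaccharides are taken to be palatinose plus trehalulose and that all four quantities are measured in the same units (for example g/l); the function names and the example values are hypothetical and are not measured data.

```python
def monosaccharide_percent(glucose, fructose, palatinose, trehalulose):
    """Glucose plus fructose as a percentage of the total of acariogenic
    disaccharides and monosaccharide degradation products."""
    monosaccharides = glucose + fructose
    total = monosaccharides + palatinose + trehalulose
    if total <= 0:
        raise ValueError("total product amount must be positive")
    return 100.0 * monosaccharides / total


def preference_tier(percent):
    """Classify a percentage against the thresholds stated in the text."""
    if percent < 1.0:
        return "most preferred (< 1%)"
    if percent < 2.0:
        return "particularly preferred (< 2%)"
    if percent < 2.5:
        return "preferred (< 2.5%)"
    return "outside the preferred range"


# Hypothetical product composition, all values in the same units.
pct = monosaccharide_percent(glucose=0.5, fructose=1.0,
                             palatinose=85.0, trehalulose=13.5)
print(f"{pct:.2f}% monosaccharides: {preference_tier(pct)}")
```

In this invented example the monosaccharides amount to 1.5 of 100 units in total, that is 1.5%, which would fall below the 2% threshold but not below the 1% threshold.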

[0062] The present invention thus further relates to a cell which contains at least one DNA sequence coding for a protein with a sucrose isomerase activity, and has a reduced palatinose and/or trehalulose metabolism as defined above. Preferably the cell according to the invention exhibits such a sucrose isomerase expression rate that the amount of sucrose isomerase expressed in the cell is >10%, preferably >15% and particularly >25% of the total amount of proteins of the cell. A cell of this type produces larger proportions of the non-cariogenic disaccharides trehalulose and/or palatinose and reduced amounts of the interfering byproducts glucose and fructose.

[0063] It is possible in one embodiment of the present invention to reduce the palatinose and/or trehalulose metabolism by partial or complete inhibition of the expression of invertase and/or palatinase genes which are responsible for the intracellular degradation of palatinose and/or trehalulose. This inhibition of gene expression can take place, for example, by site-directed mutagenesis and/or deletion of the relevant genes. A site-directed mutation of the palatinase gene shown in SEQ ID NO: 7 or of the palatinose hydrolase gene shown in SEQ ID NO: 15 can take place, for example, by introduction of a vector which is suitable for homologous chromosomal recombination and which harbors a mutated palatinase gene, and selection for organisms in which such a recombination has taken place. The principle of selection by genetic recombination is explained in E. L. Winnacker, Gene und Klone, Eine Einführung in die Gentechnologie (1985), VCH-Verlagsgesellschaft Weinheim, Germany, pp. 320 et seq.

[0064] It is furthermore possible to obtain organisms according to the invention with reduced palatinose and/or trehalulose metabolism by non-specific mutagenesis from suitable starting organisms and selection for palatinase-deficient mutants. One example of a palatinase-deficient mutant of this type is the Protaminobacter rubrum strain SZZ 13 which was deposited on Mar. 29, 1994, at the Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSM), Mascheroder Weg 1b, 38124 Braunschweig, Germany, under deposit number DSM 9121 in accordance with the provisions of the Budapest Treaty. This microorganism was prepared by non-specific mutagenesis of P. rubrum wild-type cells with N-methyl-N′-nitro-N-nitrosoguanidine and is distinguished in that it is no longer able to cleave the non-cariogenic sugars trehalulose and palatinose to glucose and fructose. Selection for such mutants can take place, for example, by using MacConkey palatinose media or minimal salt media with palatinose or glucose as sole C source. The mutants which are white on MacConkey palatinose medium (MacConkey Agar Base from Difco Laboratories, Detroit, Mich., USA (40 g/l) and 20 g/l palatinose) or which grow on minimal salt media with glucose as sole C source but not on corresponding media with palatinose as sole C source are identified as palatinase-deficient mutants.
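The selection rule described above, in which a mutant is identified either by the indicator-plate phenotype or by the growth phenotype, can be written as a simple decision function. The following is a minimal sketch for illustration only; the function and parameter names are hypothetical, and an observation that was not made is represented by `None`, which is an assumption of this sketch rather than part of the described procedure.

```python
from typing import Optional


def is_palatinase_deficient(
    white_on_macconkey_palatinose: Optional[bool] = None,
    grows_on_glucose_minimal: Optional[bool] = None,
    grows_on_palatinose_minimal: Optional[bool] = None,
) -> bool:
    """Apply the two alternative selection criteria stated in the text.

    Criterion 1: the colony is white on MacConkey palatinose medium.
    Criterion 2: the clone grows on minimal salt medium with glucose as
    sole C source but not on the corresponding palatinose medium.
    """
    by_indicator = white_on_macconkey_palatinose is True
    by_growth = (grows_on_glucose_minimal is True
                 and grows_on_palatinose_minimal is False)
    return by_indicator or by_growth


# Hypothetical observations for three clones.
print(is_palatinase_deficient(white_on_macconkey_palatinose=True))
print(is_palatinase_deficient(grows_on_glucose_minimal=True,
                              grows_on_palatinose_minimal=False))
print(is_palatinase_deficient(white_on_macconkey_palatinose=False,
                              grows_on_glucose_minimal=True,
                              grows_on_palatinose_minimal=True))
```

A clone that fails to grow on glucose as well as on palatinose is deliberately not classified as palatinase-deficient by the growth criterion in this sketch, since its failure to grow on palatinose would then not be specific.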

[0065] The present invention furthermore relates to a method for isolating nucleic acid sequences which code for a protein with a sucrose isomerase activity, wherein a gene bank from a donor organism which contains a DNA sequence coding for a protein with a sucrose isomerase activity is set up in a suitable host organism, the clones of the gene bank are examined, and the clones which contain a nucleic acid coding for a protein with sucrose isomerase activity are isolated. The nucleic acids which are isolated in this way and code for sucrose isomerase can in turn be used for introduction into cells as described above in order to provide novel producer organisms of acariogenic sugars.

[0066] In this method, the chosen host organism is preferably an organism which has no functional genes of its own for palatinose metabolism, in particular no functional palatinase and/or invertase genes. A preferred host organism is E. coli. To facilitate characterization of palatinose-producing clones, the clones in the gene bank can be examined for sucrose-cleaving clones, and the DNA sequences which are contained therein and originate from the donor organism can be isolated and transformed into an E. coli strain which does not utilize galactose and which is used as screening strain for the clones in the gene bank.

[0067] On the other hand, the examination of the clones in the gene bank for DNA sequences which code for a protein with a sucrose isomerase activity can also take place using nucleic acid probes derived from the sequences SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11 and SEQ ID NO: 13 which code for the sucrose isomerase genes from Protaminobacter rubrum, Erwinia rhapontici and the isolate SZ 62. A DNA fragment obtained by PCR reaction with oligonucleotides I and II as primers, or the oligonucleotides III and/or IV, are particularly preferably used as probes.

[0068] The present invention further relates to a process for the production of non-cariogenic sugars, in particular trehalulose and/or palatinose, which comprises using for the production of the sugars

[0069] (a) a protein with sucrose isomerase activity in isolated form,

[0070] (b) an organism which is transformed with a DNA sequence which codes for a protein with sucrose isomerase activity, or with a vector which contains at least one copy of this DNA sequence,

[0071] (c) an organism which contains at least one DNA sequence coding for a protein with a sucrose isomerase activity, and has a reduced palatinose and/or trehalulose metabolism, and/or

[0072] (d) an extract from such a cell or from such an organism.

[0073] The process is generally carried out by contacting the protein, the organism or the extract in a suitable medium with sucrose under conditions such that the sucrose is at least partly converted by the sucrose isomerase into acariogenic disaccharides. Subsequently, the acariogenic disaccharides are obtained from the medium or the organism and purified in a known manner.

[0074] In a preferred embodiment of this process, the organism, the protein or the extract is used in immobilized form. Proteins (in pure form or in extracts) are preferably immobilized by coupling of reactive side groups (for example NH₂ groups) to a suitable carrier. Immobilization of cells takes place, for example, in a sodium alginate/calcium chloride solution. A review of suitable methods for immobilizing cells and proteins is given, for example, in I. Chibata (Immobilized Enzymes, John Wiley and Sons, New York, London, 1978).

[0075] It is possible on use of a cell transformed with the sucrose isomerase gene to increase the rate of production of acariogenic sugars by comparison with known organisms by increasing the number of gene copies in the cell and/or by increasing the expression rate in combination with strong promoters. It is furthermore possible by transformation of a cell which is unable or able to only a limited extent to utilize acariogenic sugars with the sucrose isomerase gene to produce a transformed cell with whose aid it is possible to obtain acariogenic sugars, in particular palatinose and/or trehalulose, without or with fewer byproducts.

[0076] On use of a microorganism with reduced palatinose and/or trehalulose metabolism, which already contains a functional sucrose isomerase gene, transformation with an exogenous sucrose isomerase gene is not essential but may be carried out to improve the yields.

[0077] Finally, the present invention also relates to a DNA sequence which codes for a protein with palatinase or palatinose hydrolase activity and comprises

[0078] (a) one of the nucleotide sequences shown in SEQ ID NO: 7 or SEQ ID NO: 15,

[0079] (b) a nucleotide sequence which corresponds to the sequence from (a) within the scope of the degeneracy of the genetic code or

[0080] (c) a nucleotide sequence which hybridizes with the sequences from (a) and/or (b).

[0081] The invention further relates to a vector which contains at least one copy of the above mentioned DNA sequence and to a cell which is transformed with a DNA sequence or a vector as mentioned above. The invention likewise embraces a protein with palatinase activity which is encoded by a DNA sequence as indicated above and which preferably has one of the amino-acid sequences shown in SEQ ID NO: 8 or SEQ ID NO: 16.

[0082] The palatinase from P. rubrum shown in SEQ ID NO: 8 differs from known sucrose-cleaving enzymes in that it cleaves the sucrose isomers which are not cleaved by known enzymes, in particular palatinose.

[0083] The amino acid sequence shown in SEQ ID NO: 16 comprises a palatinose hydrolase from MX-45, which cleaves palatinose to form fructose and glucose. The gene coding for this enzyme is shown in SEQ ID NO: 15 and is located in the genome of MX-45 on the 5′ side of the isomerase gene shown in SEQ ID NO: 13.

[0084] The invention is further described by the following sequence listings and figures:

[0085] SEQ ID NO: 1 shows the nucleotide sequence of the gene coding for the sucrose isomerase from Protaminobacter rubrum. The sequence coding for the signal peptide terminates at nucleotide No. 99.

[0086] SEQ ID NO: 2 shows the N-terminal section of the nucleotide sequence of the gene coding for the sucrose isomerase of Erwinia rhapontici. The sequence coding for the signal peptide terminates at the nucleotide with No. 108.

[0087] SEQ ID NO: 3 shows a section of the nucleotide sequence of the gene coding for the sucrose isomerase from the isolate SZ 62.

[0088] SEQ ID NO: 4 shows the amino-acid sequence of the sucrose isomerase from Protaminobacter rubrum.

[0089] SEQ ID NO: 5 shows the N-terminal section of the amino-acid sequence of the sucrose isomerase from Erwinia rhapontici.

[0090] SEQ ID NO: 6 shows a section of the amino-acid sequence of the sucrose isomerase from the isolate SZ 62.

[0091] SEQ ID NO: 7 shows the nucleotide sequence for the palatinase gene from Protaminobacter rubrum.

[0092] SEQ ID NO: 8 shows the amino-acid sequence of the palatinase from Protaminobacter rubrum.

[0093] SEQ ID NO: 9 shows the nucleotide sequence of a variant of the sucrose isomerase gene from P. rubrum.

[0094] SEQ ID NO: 10 shows the corresponding amino-acid sequence.

[0095] SEQ ID NO: 11 shows the complete nucleotide sequence of thesucrose isomerase gene from SZ 62.

[0096] SEQ ID NO: 12 shows the corresponding amino-acid sequence.

[0097] SEQ ID NO: 13 shows most of the sucrose isomerase gene fromPseudomonas mesoacidophila (MX-45).

[0098] SEQ ID NO: 14 shows the corresponding amino acid sequence.

[0099] SEQ ID NO: 15 shows the palatinose hydrolase gene fromPseudomonas mesoacidophila (MX-45).

[0100] SEQ ID NO: 16 shows the corresponding amino-acid sequence.

[0101] FIG. 1 shows a comparison of the amino-acid sequences of the sucrose isomerases from Protaminobacter rubrum, Erwinia rhapontici and the isolate SZ 62,

[0102] FIG. 2 shows the cloning diagram for the preparation of the recombinant plasmid pHWS 118 which contains the sucrose isomerase gene on the transposon Tn 1721,

[0103] FIG. 3 shows the diagram for the preparation of E. coli transconjugants which contain the sucrose isomerase gene on an F′ plasmid and

[0104] FIG. 4 shows a comparison between the saccharides produced by P. rubrum wild-type cells and cells of the P. rubrum mutant SZZ 13.

[0105] FIG. 5 shows plasmid pHWG314.

[0106] FIG. 6 shows plasmid pHWG315.

[0107] The following examples serve to illustrate the present invention.

EXAMPLE 1

[0108] Isolation of the Sucrose Isomerase Gene from Protaminobacter rubrum

[0109] Complete DNA from the organism Protaminobacter rubrum (CBS 574.77) was partially digested with Sau3A I. Collections of fragments with a size of about 10 kbp were obtained from the resulting fragment mixture by elution after fractionation by gel electrophoresis and were ligated into a BamHI-opened derivative of λ RESII, itself a derivative of the lambda EMBL4 vector (J. Altenbuchner, Gene 123 (1993), 63-68). A gene bank was produced by transfection of E. coli and transformation of the phages into plasmids according to the above reference. The kanamycin-resistant colonies in this gene bank were screened by hybridization with the radiolabeled oligonucleotide S214, which was derived from the sequence of the N-terminus of the mature isomerase:

S214: 5′-ATCCCGAAGTGGTGGAAGGAGGC-3′ (SEQ ID NO: 21)
           T  A  A        A  A

[0110] Subsequently, the plasmid DNA was isolated from the colonies with a positive reaction after appropriate cultivation. After a restriction map had been drawn up, suitable subfragments were sequenced from a plasmid pKAT 01 obtained in this way, and thus the complete nucleotide sequence, which is shown in SEQ ID NO: 1, of the DNA coding for isomerase was obtained. The amino-acid sequence derived therefrom corresponds completely to the peptide sequence of the mature isomerase obtained by sequencing (Edman degradation). A cleavage site for SacI is located in the non-coding 3′ region of this isomerase gene, and a cleavage site for HindIII is located in the non-coding 5′ region. This makes it possible to subclone the intact isomerase gene into the vector pUCBM 21 (derivative of the vector pUC 12, Boehringer Mannheim GmbH, Mannheim, Germany) which had previously been cleaved with the said enzymes. The resulting plasmid was called pHWS 34.2 and confers on the E. coli cells harboring it the ability to synthesize sucrose isomerase.
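
The step of deriving an amino-acid sequence from the nucleotide sequence can be sketched as follows. This is a minimal illustration using the standard genetic code, not the procedure actually used; the function name translate is hypothetical, and the input is only the first 30 nucleotides of SEQ ID NO: 1. The length arithmetic assumes that the final codon of the 1890 bp sequence is the stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) in reading frame 1."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO: 1 as printed in the sequence listing.
seq1_start = "ATGCCCCGTC AAGGATTGAA AACTGCACTA"
print(translate(seq1_start))  # MPRQGLKTAL

# Length bookkeeping for SEQ ID NO: 1 (1890 bp, signal peptide ends at nt 99).
total_codons = 1890 // 3                   # 630 codons including the stop
precursor_residues = total_codons - 1      # 629, as listed for SEQ ID NO: 4
signal_residues = 99 // 3                  # 33
mature_residues = precursor_residues - signal_residues
print(precursor_residues, signal_residues, mature_residues)  # 629 33 596
```

The translated start, Met-Pro-Arg-Gln-Gly-Leu-Lys-Thr-Ala-Leu, agrees with the first ten residues of SEQ ID NO: 4.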

[0111] A variant of the sucrose isomerase gene from P. rubrum has the nucleotide sequence shown in SEQ ID NO: 9.

EXAMPLE 2

[0112] Cloning and Expression of the Sucrose Isomerase from P. rubrum in E. coli

[0113] 1. Preparation of the Plasmid pHWS88

[0114] The non-coding 5′ region of the sucrose isomerase gene was deleted from the plasmid pHWS 34.2, using an oligonucleotide S434 with the sequence 5′-CGGAATTCTTATGCCCCGTCAAGGA-3′ (SEQ ID NO: 22), with simultaneous introduction of an EcoRI cleavage site (GAATTC). The isomerase gene derivative obtained in this way was treated with BstE II, the protruding BstE II end was digested off with S1 nuclease and subsequently digestion with EcoRI was carried out. The isomerase gene treated in this way was cloned into the vector pBTacI (Boehringer Mannheim GmbH, Mannheim, Germany) which had been pretreated with EcoRI and SmaI. The resulting vector pHWS 88 (DSM 8824) contains the modified isomerase gene with a preceding EcoRI restriction site in front of the ATG start codon, and the 3′ region of the isomerase gene up to the S1-truncated BstE II cleavage site. On induction with IPTG, this vector confers on the cells harboring this plasmid the ability to produce isomerase and resistance to ampicillin (50 to 100 μg/ml). E. coli host cells which overproduce the lac repressor are preferably used for producing isomerase.

[0115] 2. Preparation of the Plasmid pHWS118::Tn1721Tet

[0116] The gene cassette for the sucrose mutase was incorporated into a transposon.

[0117] This took place by cloning an SphI/HindIII DNA fragment from the plasmid pHWS88, which harbors the sucrose mutase gene under the control of the tac promoter, into the plasmid pJOE105 on which the transposon Tn 1721 is located. The plasmid pJOE105 was deposited on Dec. 16, 1993, at the DSM under the deposit number DSM 8825 in accordance with the provisions of the Budapest Treaty. The resulting plasmid pHWS118, on which the sucrose mutase gene is under the control of the regulatable tac promoter, was used to transform an E. coli strain containing an F′ plasmid. FIG. 2 shows the cloning diagram for the preparation of pHWS118 from pHWS88 and pJOE105.

[0118] E. coli transconjugants containing the sucrose mutase gene were prepared as described in the diagram in FIG. 3. For this purpose, firstly the F′-harboring E. coli strain CSH36 (J. H. Miller, Experiments in Molecular Genetics, Cold Spring Harbor Laboratory (1972), p. 18), which carries the Lac+ phenotype mediated by the F′ plasmid, was crossed with the E. coli strain JM108 which is resistant to nalidixic acid (Sambrook et al., supra, p. A9-A13). Selection on minimal medium to which lactose, proline and nalidixic acid were added resulted in an F′-Lac-harboring transconjugant. This was additionally transformed with the Iq plasmid pFDX500 (Brinkmann et al., Gene 85 (1989), 109-114) in order to permit control of the sucrose mutase gene by the tac promoter.

[0119] The transconjugant prepared in this way was transformed with the transposon plasmid pHWS118 harboring the sucrose mutase gene. For selection of transconjugants, crossing into the streptomycin-resistant E. coli strain HB101 (Boyer and Roulland-Dussoix, J. Mol. Biol. 41 (1969), 459-472) was carried out. Transfer of the tetracycline resistance mediated by the transposon was possible only after transposition of the modified Tn1721Tet from the plasmid pHWS118, which is not capable of conjugation or mobilization, to the F′ plasmid which is capable of conjugation. Transmission of the F′ plasmid with the modified transposon in HB101 was selected on LB plates containing streptomycin and tetracycline, and retested on ampicillin and nalidixic acid plates.

[0120] 3. Expression of the Sucrose Isomerase in E. coli

[0121] Examination of the enzyme production by such F′ plasmid-harboring E. coli cells showed that it was possible to produce sucrose mutase protein. F′ plasmid-containing HB101 cells which harbored no additional Lac repressor plasmid (for example K1/1 or K1/10) produced sucrose mutase protein in similar amounts with and without the inducer isopropyl β-D-thiogalactoside (IPTG). The productivities of three transconjugants K1/1, K1/10 and K1/4 are shown in Table 1.

[0122] It was possible to observe normal growth of the E. coli cells during production of sucrose mutase protein.

[0123] Introduction of the sucrose mutase gene into the F′ plasmid in the presence of the repressor-encoding plasmid pFDX500 (see transconjugant K1/4) made it possible to control enzyme production with the inducer IPTG. Whereas no enzymatic activity was measured without IPTG, production of about 1.6 U/mg sucrose mutase protein was obtainable after induction for 4 hours.

[0124] No adverse effect on cell growth was observable. The plasmid-harboring E. coli cells reached a density of about 3 OD₆₀₀ after induction for 4 hours.

[0125] Up to 1.6 U/mg sucrose mutase activity was measured in transformed E. coli. The synthetic performance is comparable to that of P. rubrum. Analysis of the produced enzyme by SDS gel electrophoresis provides no evidence of inactive protein aggregates. The band of the sucrose mutase protein was only weakly visible with Coomassie staining and was detectable clearly only in a Western blot. It was possible to correlate the strength of the protein band and the measured enzymatic activity in the production of sucrose mutase in E. coli.

EXAMPLE 3

[0126] Isolation of the Sucrose Isomerase Gene from Erwinia rhapontici

[0127] A gene bank was produced by restriction cleavage of the complete DNA from Erwinia rhapontici (NCPPB 1578) in the same way as described in Example 1.

[0128] Using the primer mixtures

5′-TGGTGGAAAGAAGCTGT-3′ (SEQ ID NO: 23)
           G  G

[0129] and

5′-TCCCAGTTCAGGTCCGGCTG-3′ (SEQ ID NO: 24),
              A

[0130] PCR amplification resulted in a DNA fragment with whose aid it is possible to identify colonies containing the mutase gene by hybridization.

[0131] In this way, a positive clone pSST2023 which contains a fragment, 1305 nucleotides long, of the Erwinia isomerase gene was found. The nucleotide sequence of this fragment is depicted in SEQ ID NO: 2.

[0132] Sequence comparison with the Protaminobacter gene reveals an identity of 77.7% and a similarity of 78% for the complete gene section including the signal peptide region, and an identity of 83.4% and a similarity of 90.3% at the amino-acid level.

[0133] The sequence differences are mainly concentrated in the signal peptide region. For this reason, only the enzyme-encoding region responsible for the actual mutase activity, without the signal peptide, should be considered for comparison. From these viewpoints, the identity or similarity at the nucleotide level emerges as 79%. Comparison of the amino-acid sequences (FIG. 1) in this section shows 87.9% identical amino acids. Of 398 amino acids (this corresponds to 71% of the complete enzyme) in the Erwinia mutase, 349 are the same as in Protaminobacter. 25 of 48 exchanged amino acids show strong similarity so that the overall similarity at the AA level emerges as 94%. The AA exchanges are mainly concentrated in the region between amino acid 141 and 198. In front of this region there is a sequence of 56 conserved amino acids. Other sections also exhibit particularly high conservation (see FIG. 1).
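
The identity and similarity percentages can be recomputed from the residue counts quoted in the paragraph above. The sketch below takes those counts at face value; the helper name percent is hypothetical. With the quoted counts the similarity rounds to the stated 94%, while the identity comes out marginally below the stated 87.9%, which presumably reflects a slightly different count or alignment length behind that figure.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts as quoted in the text for the compared section.
COMPARED_RESIDUES = 398
IDENTICAL_RESIDUES = 349
STRONGLY_SIMILAR_EXCHANGES = 25

identity = percent(IDENTICAL_RESIDUES, COMPARED_RESIDUES)
similarity = percent(
    IDENTICAL_RESIDUES + STRONGLY_SIMILAR_EXCHANGES, COMPARED_RESIDUES
)

print(f"identity:   {identity:.1f}%")
print(f"similarity: {similarity:.1f}%")
```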

[0134] These data show that, for the section cloned and sequenced to date, overall there is very extensive conservation of the two mutases from Erwinia and Protaminobacter.

[0135] Identity of the Cloned Mutase Gene from Erwinia

[0136] The probe chosen for a rehybridization experiment with genomic Erwinia DNA was the SspI/EcoRI fragment, which is about 500 bp in size, from pSST2023. This fragment was used, after digoxigenin labeling, for hybridization with Erwinia DNA with high stringency (68° C.). Complete Erwinia DNA cut with SspI/EcoRI showed a clear hybridization signal with the expected size of about 500 bp. Erwinia DNA cut only with SspI showed a hybridization signal of about 2 kb.

[0137] It was possible to verify by the successful rehybridization of pSST2023 with genomic Erwinia DNA that the mutase region cloned into pSST2023 originates from Erwinia rhapontici.

[0138] Cloning of the C-Terminal Part-Fragment of the Erwinia Mutase

[0139] The N-terminal part-fragment of the Erwinia mutase gene which has been cloned to date has a size of 1.3 kb and has the nucleotide sequence shown in SEQ ID NO: 2. Since it can be assumed that the complete Erwinia gene is virtually identical in size to the known Protaminobacter gene (1.8 kb), a section of about 500 bp is missing from the C-terminal region of the Erwinia gene.

[0140] The SspI fragment which is about 2 kb in size from the complete Erwinia DNA was selected for cloning of the Erwinia C-terminus. In a Southern blot, this fragment provides a clear signal with a digoxigenin-labeled DNA probe from pSST2023. This 2 kb SspI fragment overlaps by about 500 bp at the 3′ end with the region already cloned in pSST2023. Its size ought to be sufficient for complete cloning of the missing gene section of about 500 bp. The digoxigenin-labeled fragment probe SspI/EcoRI from pSST2023 is suitable for identifying clones which are sought.

EXAMPLE 4

[0141] Preparation of a Protaminobacter Palatinase-Deficient Mutant

[0142] Cells of Protaminobacter rubrum (CBS 574.77) were mutagenized with N-methyl-N′-nitro-N-nitrosoguanidine by the method of Adelberg et al. (Biochem. Biophys. Research Commun. 18 (1965), 788) as modified by Miller, J. (Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, 125-179 (1972)). Palatinase-deficient mutants were selected using MacConkey palatinose medium (MacConkey Agar Base (Difco Laboratories, Detroit, Mich., USA), 40 g/l, with the addition of 20 g/l palatinose, sterilized by filtration, and 25 mg/l kanamycin) and minimal salt media (10.5 g of K₂HPO₄, 4.5 g of KH₂PO₄, 1 g of (NH₄)₂SO₄, 0.5 g of sodium citrate·2H₂O, 0.1 g of MgSO₄·7H₂O, 1 mg of thiamine, 2 g of palatinose or glucose, 25 mg of kanamycin and 15 g of agar per liter, pH 7.2). Mutants of P. rubrum which are white on MacConkey palatinose medium, or which grow on minimal salt medium with glucose but not on the same medium with palatinose, are identified as palatinase-deficient mutants. The enzyme activity of cleaving palatinose to glucose and fructose (palatinase activity) cannot, in contrast to the wild-type, be detected in cell extracts from these mutants. When these cells are cultivated in minimal salt medium with 0.2% sucrose as sole C source, palatinose (isomaltulose) accumulates continuously and detectably, whereas in wild-type cells palatinose can be detected only transiently, in the time from 4 to 11 hours after starting the culture. Overnight cultures in the same medium contain no palatinose in the case of the wild-type cells but contain >0.08% palatinose in the case of the mutant SZZ 13 (DSM 9121) prepared in this way (see FIG. 4).

EXAMPLE 5

[0143] Immobilization of Microorganism Cells

[0144] Cells are rinsed off a subculture of the appropriate strain using 10 ml of a sterile nutrient substrate composed of 8 kg of concentrated juice from a sugar factory (dry matter content = 65%), 2 kg of corn steep liquor, 0.1 kg of (NH₄)₂HPO₄ and 89.9 kg of distilled water, pH 7.2. This suspension is used as inoculum for preculture in 1 l flasks containing 200 ml of nutrient solution of the above composition in shaking machines. After an incubation time of 30 hours at 29° C., 10 flasks (total contents 2 l) are used to inoculate 18 l of nutrient solution of the above composition in a 30 l small fermenter, and fermentation is carried out at 29° C. and a stirring speed of 350 rpm while introducing 20 l of air per minute.
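
The nutrient composition above is given per 100 kg, so preparing a smaller batch means scaling each component proportionally. The following sketch shows that arithmetic together with the inoculum ratio for the fermenter. The function names are hypothetical, and treating 18 l of nutrient solution as roughly 18 kg (density of about 1 kg/l) is an assumption that is not stated in the source.

```python
# Composition of the nutrient solution as stated, in kg per 100 kg.
RECIPE_KG = {
    "concentrated juice (65% dry matter)": 8.0,
    "corn steep liquor": 2.0,
    "(NH4)2HPO4": 0.1,
    "distilled water": 89.9,
}


def scale_recipe(batch_kg):
    """Scale every component of the recipe to the requested batch mass."""
    total = sum(RECIPE_KG.values())
    return {name: kg * batch_kg / total for name, kg in RECIPE_KG.items()}


def inoculum_fraction(inoculum_l, medium_l):
    """Fraction of the final working volume contributed by the inoculum."""
    return inoculum_l / (inoculum_l + medium_l)


# Assumption: 18 l of nutrient solution weighs about 18 kg.
batch = scale_recipe(18.0)
for name, kg in batch.items():
    print(f"{name}: {kg:.3f} kg")

# Ten precultures of 200 ml each are added to 18 l of fresh medium.
print(f"inoculum: {inoculum_fraction(10 * 0.2, 18.0):.0%} of the final volume")
```

Under these assumptions the 2 l of pooled preculture make up one tenth of the 20 l working volume in the 30 l fermenter.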

[0145] After organism counts above 5×10⁹ organisms per ml are reached, the fermentation is stopped and the cells are harvested from the fermenter solution by centrifugation. The cells are then suspended in a 2% strength sodium alginate solution and immobilized by dropwise addition of the suspension to a 2% strength calcium chloride solution. The resulting immobilizate beads are washed with water and can be stored at +4° C. for several weeks.

[0146] Cells of the palatinase-deficient mutant SZZ 13 (DSM 9121) show better catalytic properties in respect of their product composition than do comparable cells from the known microorganisms Protaminobacter rubrum (CBS 574.77) and Erwinia rhapontici (NCPPB 1578).

[0147] Whole cells and crude extracts of SZZ 13, and an immobilizate of SZZ 13 in calcium alginate prepared as above, were evaluated in respect of product composition in an activity assay. Before the actual activity assay, the immobilizate was swollen in 0.1 mol/l potassium phosphate buffer, pH 6.5.

[0148] The activity measurements at 25° C. revealed that no fructose and glucose were found with the mutant SZZ 13, while with P. rubrum wild-type cells 2.6% fructose and glucose (based on the total of mono- and disaccharides) were found in whole cells and 12.0% were found in the crude extract. In the case of E. rhapontici, 4% glucose and fructose were found in whole cells, and 41% in the crude extract.

EXAMPLE 6

[0149] Isolation of the Sucrose Isomerase Gene from Other Microorganisms

[0150] Partial digestion of genomic DNA from the isolate SZ62 (Enterobacter spec.), the organism Pseudomonas mesoacidophila (MX-45) or from another microorganism and insertion of the resulting fragments into suitable E. coli vectors and transformation result in a gene bank whose clones contain genomic sections between 2 and 15 kb of the donor organism.

[0151] Those E. coli cells which harbor these plasmids and which display a red coloration of the colony are selected by plating on MacConkey palatinose medium. The plasmid DNA contained in these cells is transferred into an E. coli mutant which is unable to grow on galactose as sole C source (for example ED 8654, Sambrook et al., supra, pages A9-A13).

[0152] This transformed cell line is able to identify palatinose producers in the gene bank which has been prepared as described above from DNA of the donor organism.

[0153] To identify the palatinose-producing clones which are sought, the cells of the gene bank are isolated and cultured on minimal salt media containing galactose and sucrose. After replica plating of the colonies on plates containing the same medium, the cells are killed by exposure to toluene vapor. Subsequently, cells of the screening strain are spread as lawn in minimal salt soft agar without added C source over the colonies of the gene bank and incubated. Significant growth of the cells of the screening strain appears only at the location of cells in the gene bank which have produced palatinose. The isomerase content emerges on testing the cells of the replica control.

[0154] The E. coli clones identified in this way are unable to grow on palatinose as sole C source in the medium, show no ability to cleave sucrose in a test on whole cells or on cell extracts, but on cultivation under these conditions and without addition of sucrose to the medium produce palatinose.

[0155] Alternatively, isomerase clones can also be identified using a PCR fragment prepared by the procedure of Example 3.

[0156] Use of plasmid DNA from the E. coli clones identified in this way as probes for hybridization on filters with immobilized DNA from the donor organism allows the gene regions which harbor isomerase genes to be detected and specifically made available.

[0157] A clone which contains the nucleotide sequence shown in SEQ ID NO: 3, with the amino-acid sequence which is derived therefrom and shown in SEQ ID NO: 6, was identified in this way. In the same way an isomerase clone from DNA of the bacterial strain Pseudomonas mesoacidophila MX-45 (FERM 11808) was found.

[0158] The complete nucleotide sequence and amino-acid sequence of the sucrose isomerase from SZ 62 are depicted in SEQ ID NO: 11 and 12. A large part of the nucleotide sequence and amino-acid sequence of the sucrose isomerase from MX-45 are depicted in SEQ ID NO: 13 and 14.

EXAMPLE 7

[0159] Cloning of a Palatinase Gene

[0160] The Protaminobacter rubrum gene bank prepared in Example 1 was screened with the radiolabeled oligonucleotide mixture S433, which was derived from the sequence of the N-terminus of the isolated palatinase and had the sequence 5′-CA(G,A)TT(C,T)GG(T,C)TA(C,T)GG-3′ (SEQ ID NO: 25).
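
An oligonucleotide mixture of this kind is the set of all sequences obtained by picking one base at each bracketed position. The sketch below enumerates that set for S433 as written above; the parser and the function name expand_degenerate are illustrative assumptions and only handle the bracket notation used in this document.

```python
import re
from itertools import product


def expand_degenerate(oligo):
    """Expand bracket notation such as CA(G,A)TT into all plain sequences."""
    # Each token is either a bracketed list of alternatives or a single base.
    tokens = re.findall(r"\(([ACGT,]+)\)|([ACGT])", oligo)
    choices = [alt.split(",") if alt else [base] for alt, base in tokens]
    return ["".join(combo) for combo in product(*choices)]


S433 = "CA(G,A)TT(C,T)GG(T,C)TA(C,T)GG"
variants = expand_degenerate(S433)

print(len(variants))   # 16
print(variants[0])     # CAGTTCGGTTACGG
```

Four two-fold degenerate positions give 2⁴ = 16 individual 14-mers in the mixture.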

[0161] A positive clone was found, and a plasmid named pKAT 203 was isolated therefrom.

[0162] E. coli cells which harbor the plasmid pKAT 203 are able to metabolize palatinose. The cleavage of palatinose to glucose and fructose which is detectable in the activity assay suggests that there is a “palatinase”.

[0163] Sequencing pKAT203 DNA with the oligonucleotide S433 as primer yielded a DNA sequence from which, after translation into amino-acid sequence data, the N-terminal amino acids known to us could be read off. An open reading frame was obtained by a subsequent sequencing step.

[0164] Determination of the Sequence of the “Palatinase” Gene

[0165] For further sequencing of the “palatinase” gene, part-fragments from the plasmid pKAT 203 were selected on the basis of the restriction map and subcloned in the M13 phage system, and the single-stranded phage DNA was sequenced with the universal primer 5′-GTTTTCCCAGTCACGAC-3′ (SEQ ID NO: 26).

[0166] Combination of the resulting DNA sequence data for the individual fragments taking account of overlapping regions allows a continuous reading frame of 1360 base pairs to be determined for the “palatinase” (SEQ ID NO: 7).

[0167] Translation of this DNA sequence into amino-acid data reveals a protein with 453 amino acids (SEQ ID NO: 8) and a molecular weight, which can be deduced therefrom, of about 50,000 Da. This is consistent with the finding that a protein fraction which had a band at about 48,000 Da in the SDS gel was obtainable by concentration of the “palatinase” activity. In the native gel, the palatinose-cleaving activity was attributable to a band with a size of about 150,000 Da.
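
The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from the source; an exact mass would have to be computed from the actual sequence in SEQ ID NO: 8.

```python
ORF_LENGTH_BP = 1360           # reading frame length stated for SEQ ID NO: 7
PROTEIN_RESIDUES = 453         # protein length stated for SEQ ID NO: 8
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb value (assumption)

# 1360 bp hold 453 complete codons with one nucleotide left over.
complete_codons, leftover_nt = divmod(ORF_LENGTH_BP, 3)
estimated_mass_da = PROTEIN_RESIDUES * AVERAGE_RESIDUE_MASS_DA

print(complete_codons, leftover_nt)  # 453 1
print(estimated_mass_da)             # 49830
```

The estimate of roughly 49,800 Da is in line with the deduced molecular weight of about 50,000 Da and with the band at about 48,000 Da in the SDS gel.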

[0168] Comparisons of Homology with Other Known Proteins

[0169] Comparison of the amino-acid sequence derivable from the DNA sequence with data stored in a sequence data bank (SwissProt) revealed a homology with melibiase from E. coli (MelA) (in two parts: identity 32%).

EXAMPLE 8

[0170] Cloning of a Palatinose Hydrolase Gene from P. mesoacidophila MX-45

[0171] A gene with the nucleotide sequence shown in SEQ ID NO: 15 was isolated from the gene bank prepared from the microorganism P. mesoacidophila MX-45 in Example 6. This gene codes for a protein with the amino-acid sequence shown in SEQ ID NO: 16. The protein is a palatinose hydrolase which catalyzes the cleavage of palatinose to form fructose and glucose.

EXAMPLE 9

[0172] Cloning and Expression of the Sucrose Isomerase from Protaminobacter rubrum and Pseudomonas mesoacidophila MX-45 in E. coli

[0173] 1. Preparation of the Plasmids pHWG314 and pHWG315

[0174] The plasmids were prepared by inserting the gene modules into the vector pJOE2702 (Wiese et al. (2001) supra) digested with NdeI/HindIII (314) and NdeI/BamHI (315), respectively. The plasmids carry the entire sequence coding for each mutase. FIGS. 5 and 6 show the restriction maps of both plasmids. E. coli JM109 was used as host.

[0175] 2. Expression of the Sucrose Isomerase in E. coli and Isolation

[0176] The production of the enzyme was induced by adding 0.2% rhamnose to the medium. The cells were harvested by centrifugation and disrupted by sonication. Standard conditions were applied, as described in the other examples. For the purification from E. coli, the raw extract was chromatographed on a cation exchange column, e.g. a MonoS column (Pharmacia). The material was loaded on the column in the presence of 10 mM Ca-acetate, pH 6.5. The elution was carried out with a NaCl gradient of from 0 to 100 mM. Then the fractions were tested for their protein content. All results are contained in Table 2.

[0177] 3. Isolation of the Sucrose Isomerase from Protaminobacter rubrum and Pseudomonas mesoacidophila MX-45

[0178] The enzymes from the (wild-type) strains Protaminobacter rubrum and Pseudomonas mesoacidophila were not purified. Purification according to the above protocol was successful only regarding the enzymes produced in E. coli.

[0179] After cultivation in a suitable complete medium containing 2% sucrose the cells of the wild-type strains were harvested at 30° C. and disrupted. Then the enzymatic activity and the protein content were determined.

[0180] Surprisingly, the recombinant sucrose isomerase can be separated by a one-step procedure from homologous proteins in the E. coli extract, with high yield and purity. Apparently the recombinant protein has a different charge composition compared to the isomerase from native organisms.

[0181] A comparison of the results shown in Table 2 shows that with the help of the recombinant E. coli strains significantly higher sucrose isomerase yields can be obtained than with the wild-type strains Pseudomonas mesoacidophila MX-45 and Protaminobacter rubrum. The amount of recombinant sucrose isomerase expressed in the E. coli strains is about 15.6 and 28.9%, respectively, of the total amount of proteins in the cell. Contrary thereto, the amount of sucrose isomerase formed in the wild-type strains Pseudomonas mesoacidophila MX-45 and Protaminobacter rubrum was so small that it could not be detected by using conventional detection methods. Furthermore, the activity of the recombinant sucrose isomerase is about 10-times higher than the activity of the sucrose isomerase formed in the respective wild-type strains. Consequently, the protein isolated from the wild-type strains must have been inactivated to a great extent when isolated from the wild-type strains.

TABLE 1
Sucrose mutase activity in E. coli HB101 (F′::Tn1721 [Mutase])

          U/mg mutase after 4 hours   U/mg mutase after 4 hours
Strain    without induction           induction with 50 μM IPTG
K1/1      1.0                         1.2
K1/10     0.9                         1.1
K1/4      0                           1.6

[0182] TABLE 2
The results obtained under items 3 and 4 of example 9 are summarized in the following table:

Wild-type strain or        Gene   Culture     OD₆₀₀   U ml⁻¹   mg ml⁻¹   U mg⁻¹   U (total)   U mg⁻¹    % cell
E. coli plasmid                   vol. (ml)                                                  (pure)    protein
P. mesoacidophila MX-45    mutB   400         4.0     4.3      0.48      9.0      1,720       n.m.      u.m.
pHWG315                    mutB   400         3.8     45.2     0.46      98.6     18,080      340       28.9%
P. rubrum                  smuA   400         3.0     1.6      0.36      4.4      640         n.m.      u.m.
pHWG314                    smuA   400         2.8     16.5     0.34      48.5     6,600       310       15.6%
P. mesoacidophila MX-45    mutB   3000        27.2    28.4     3.26      8.7      85,200      n.m.      u.m.
pHWG315                    mutB   3000        26.6    308.5    3.19      96.7     925,500     n.m.      n.m.
P. rubrum                  smuA   3000        29.5    14.9     3.54      4.2      44,700      n.m.      u.m.
pHWG314                    smuA   3000        26.1    149.6    3.13      47.8     448,800     n.m.      n.m.
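
The derived columns of Table 2 appear to follow from the measured ones by simple arithmetic: specific activity is the volumetric activity divided by the protein concentration, total units are the volumetric activity times the culture volume, and the share of cell protein is the crude specific activity relative to that of the pure enzyme. These relationships are inferred from the numbers themselves and are an assumption about how the table was built; the function names are hypothetical. The sketch uses the pHWG314 (smuA) row for the 400 ml culture, and small deviations from the tabulated values are to be expected from rounding.

```python
def specific_activity(u_per_ml, mg_per_ml):
    """Crude specific activity in U per mg of protein."""
    return u_per_ml / mg_per_ml


def total_units(u_per_ml, culture_volume_ml):
    """Total enzyme units in the culture."""
    return u_per_ml * culture_volume_ml


def percent_of_cell_protein(crude_u_per_mg, pure_u_per_mg):
    """Share of the enzyme in total cell protein, from specific activities."""
    return 100.0 * crude_u_per_mg / pure_u_per_mg


# pHWG314 (smuA), 400 ml culture: 16.5 U/ml, 0.34 mg/ml, 310 U/mg when pure.
crude = specific_activity(16.5, 0.34)
total = total_units(16.5, 400)
share = percent_of_cell_protein(crude, 310)

print(f"specific activity: {crude:.1f} U/mg")
print(f"total units:       {total:.0f} U")
print(f"cell protein:      {share:.1f} %")
```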

[0183]

1 26 1890 base pairs nucleic acid single linear DNA (genomic) 1ATGCCCCGTC AAGGATTGAA AACTGCACTA GCGATTTTTC TAACCACATC ATTATGCATC 60TCATGCCAGC AAGCCTTCGG TACGCAACAA CCCTTGCTTA ACGAAAAGAG TATCGAACAG 120TCGAAAACCA TACCTAAATG GTGGAAGGAG GCTGTTTTTT ATCAGGTGTA TCCGCGCTCC 180TTTAAAGACA CCAACGGAGA TGGCATCGGG GATATTAACG GCATCATAGA AAAATTAGAC 240TATCTAAAAG CCTTGGGGAT TGATGCCATT TGGATCAACC CACATTATGA TTCTCCGAAC 300ACGGATAATG GTTACGATAT ACGTGATTAT CGAAAAATCA TGAAAGAATA TGGCACGATG 360GAGGATTTTG ACCGCCTGAT TTCTGAAATG AAAAAACGGA ATATGCGGTT GATGATTGAT 420GTGGTCATCA ACCACACCAG CGATCAAAAC GAATGGTTTG TTAAAAGTAA AAGCAGTAAG 480GATAATCCTT ATCGCGGCTA TTATTTCTGG AAAGATGCTA AAGAAGGGCA GGCGCCTAAT 540AATTACCCTT CATTCTTTGG TGGCTCGGCG TGGCAAAAAG ATGAAAAGAC CAATCAATAC 600TACCTGCACT ATTTTGCTAA ACAACAGCCT GACCTAAACT GGGATAATCC CAAAGTCCGT 660CAAGATCTTT ATGCAATGTT ACGTTTCTGG TTAGATAAAG GCGTGTCTGG TTTACGTTTT 720GATACGGTAG CGACCTACTC AAAAATTCCG GATTTCCCAA ATCTCACCCA ACAACAGCTG 780AAGAATTTTG CAGCGGAGTA TACCAAGGGC CCTAATATTC ATCGTTACGT CAATGAAATG 840AATAAAGAGG TCTTGTCTCA TTACGACATT GCGACTGCCG GTGAAATCTT TGGCGTACCC 900TTGGATCAAT CGATAAAGTT CTTCGATCGC CGCCGTGATG AGCTGAACAT TGCATTTACC 960TTTGACTTAA TCAGACTCGA TCGAGACTCT GATCAAAGAT GGCGTCGAAA AGATTGGAAA 1020TTGTCGCAAT TCCGGCAGAT CATCGATAAC GTTGACCGTA CTGCAGGAGA ATATGGTTGG 1080AATGCCTTCT TCTTGGATAA CCACGACAAT CCGCGCGCTG TCTCGCACTT TGGCGATGAT 1140GATCGCCCAC AATGGCGTGA GCCATCGGCT AAAGCGCTTG CAACCTTGAC GCTGACTCAA 1200CGAGCAACAC CTTTTATTTA TCAAGGTTCA GAATTGGGCA TGACCAATTA CCCGTTTAAA 1260GCTATTGATG AATTCGATGA TATTGAGGTG AAAGGTTTTT GGCATGACTA CGTTGAGACA 1320GGAAAGGTCA AAGCCGACGA GTTCTTGCAA AATGTACGCC TGACGAGCAG GGATAACAGC 1380CGGACGCCGT TCCAATGGGA TGGGAGCAAA AATGCAGGAT TCACGAGCGG AAAACCTTGG 1440TTCAAGGTCA ACCCAAACTA CCAGGAAATC AATGCAGTAA GTCAAGTCAC ACAACCCGAC 1500TCAGTATTTA ACTATTATCG TCAGTTGATC AAGATAAGGC ATGACATCCC GGCACTGACC 1560TATGGTACAT ACACCGATTT GGATCCTGCA AATGATTCGG TCTACGCCTA TACACGCAGC 1620CTTGGGGCGG AAAAATATCT TGTTGTTGTT AACTTCAAGG AGCAAATGAT 
GAGATATAAA 1680TTACCGGATA ATTTATCCAT TGAGAAAGTG ATTATAGACA GCAACAGCAA AAACGTGGTG 1740AAAAAGAATG ATTCATTACT CGAGCTAAAA CCATGGCAGT CAGGGGTTTA TAAAACTAAA 1800TCAATAAATC TCATAGTCAC GCCAAATAAT GTAAATATAT TGAAACTATT AAAACCGGCA 1860TTTTATGCCG GTTTTTTTAG CGCAAAATAG 1890 1305 base pairs nucleic acidsingle linear DNA (genomic) misc_RNA 28 /note= “N = Unknown” misc_RNA85..87 /note= “N = Unknown” 2 ATGTCCTCTC AAGGATTGAA AACGGCTNTCGCTATTTTTC TTGCAACCAC TTTTTCTGCC 60 ACATCCTATC AGGCCTGCAG TGCCNNNCCAGATACCGCCC CCTCACTCAC CGTTCAGCAA 120 TCAAATGCCC TGCCCACATG GTGGAAGCAGGCTGTTTTTT ATCAGGTATA TCCACGCTCA 180 TTTAAAGATA CGAATGGGGA TGGCATTGGGGATTTAAACG GTATTATTGA GAATTTAGAC 240 TATCTGAAGA AACTGGGTAT TGATGCGATTTGGATCAATC CACATTACGA TTCGCCGAAT 300 ACGGATAATG GTTATGACAT CCGGGATTACCGTAAGATAA TGAAAGAATA CGGTACGATG 360 GAAGACTTTG ACCGTCTTAT TTCAGAAATGAAGAAACGCA ATATGCGTTT GATGATTGAT 420 ATTGTTATCA ACCACACCAG CGATCAGCATGCCTGGTTTG TTCAGAGCAA ATCGGGTAAG 480 AACAACCCCT ACAGGGACTA TTACTTCTGGCGTGACGGTA AGGATGGCCA TGCCCCCAAT 540 AACTATCCCT CCTTCTTCGG TGGCTCAGCCTGGGAAAAAG ACGATAAATC AGGCCAGTAT 600 TACCTCCATT ACTTTGCCAA ACAGCAACCCGACCTCAACT GGGACAATCC CAAAGTCCGT 660 CAAGACCTGT ATGACATGCT CCGCTTCTGGTTAGATAAAG GCGTTTCTGG TTTACGCTTT 720 GATACCGTTG CCACCTACTC GAAAATCCCGAACTTCCCTG ACCTTAGCCA ACAGCAGTTA 780 AAAAATTTCG CCGAGGAATA TACTAAAGGTCCTAAAATTC ACGACTACGT GAATGAAATG 840 AACAGAGAAG TATTATCCCA CTATGATATCGCCACTGCGG GGGAAATATT TGGGGTTCCT 900 CTGGATAAAT CGATTAAGTT TTTCGATCGCCGTAGAAATG AATTAAATAT AGCGTTTACG 960 TTTGATCTGA TCAGGCTCGA TCGTGATGCTGATGAAAGAT GGCGGCGAAA AGACTGGACC 1020 CTTTCGCAGT TCCGAAAAAT TGTCGATAAGGTTGACCAAA CGGCAGGAGA GTATGGGTGG 1080 AATGCCTTTT TCTTAGACAA TCACGACAATCCCCGCGCGG TTTCTCACTT TGGTGATGAT 1140 CGACCACAAT GGCGCGAGCA TGCGGCGAAAGCACTGGCAA CATTGACGCT GACCCAGCGT 1200 GCAACGCCGT TTATCTATCA GGGTTCAGAACTCGGTATGA CCAATTATCC CTTTAAAAAA 1260 ATCGATGATT TCGATGATGT AGAGGTGAAAGGTTTTTGGC AAGAC 1305 471 base pairs nucleic acid single linear DNA(genomic) 3 GTTTTTTATC AGATCTATCC TCGCTCATTT AAAGACACCA 
ATGATGATGGCATTGGCGAT 60 ATTCGCGGTA TTATTGAAAA GCTGGACTAT CTGAAATCGC TCGGTATTGACGCTATCTGG 120 ATCAATCCCC ATTACGACTC TCCGAACACC GATAACGGCT ATGACATCAGTAATTATCGT 180 CAGATAATGA AAGAGTATGG CACAATGGAG GATTTTGATA GCCTTGTTGCCGAAATGAAA 240 AAACGAAATA TGCGCTTAAT GATCGACGTG GTCATTAACC ATACCAGTGATCAACACCCG 300 TGGTTTATTC AGAGTAAAAG CGATAAAAAC AACCCTTATC GTGACTATTATTTCTGGCGT 360 GACGGAAAAG ATAATCAGCC ACCTAATAAT TACCCCTCAT TTTTCGGCGGCTCGGCATGG 420 CAAAAAGATG CAAAGTCAGG ACAGTACTAT TTACACTATT TTGCCAGACA G471 629 amino acids amino acid single linear peptide 4 Met Pro Arg GlnGly Leu Lys Thr Ala Leu Ala Ile Phe Leu Thr Thr 1 5 10 15 Ser Leu CysIle Ser Cys Gln Gln Ala Phe Gly Thr Gln Gln Pro Leu 20 25 30 Leu Asn GluLys Ser Ile Glu Gln Ser Lys Thr Ile Pro Lys Trp Trp 35 40 45 Lys Glu AlaVal Phe Tyr Gln Val Tyr Pro Arg Ser Phe Lys Asp Thr 50 55 60 Asn Gly AspGly Ile Gly Asp Ile Asn Gly Ile Ile Glu Lys Leu Asp 65 70 75 80 Tyr LeuLys Ala Leu Gly Ile Asp Ala Ile Trp Ile Asn Pro His Tyr 85 90 95 Asp SerPro Asn Thr Asp Asn Gly Tyr Asp Ile Arg Asp Tyr Arg Lys 100 105 110 IleMet Lys Glu Tyr Gly Thr Met Glu Asp Phe Asp Arg Leu Ile Ser 115 120 125Glu Met Lys Lys Arg Asn Met Arg Leu Met Ile Asp Val Val Ile Asn 130 135140 His Thr Ser Asp Gln Asn Glu Trp Phe Val Lys Ser Lys Ser Ser Lys 145150 155 160 Asp Asn Pro Tyr Arg Gly Tyr Tyr Phe Trp Lys Asp Ala Lys GluGly 165 170 175 Gln Ala Pro Asn Asn Tyr Pro Ser Phe Phe Gly Gly Ser AlaTrp Gln 180 185 190 Lys Asp Glu Lys Thr Asn Gln Tyr Tyr Leu His Tyr PheAla Lys Gln 195 200 205 Gln Pro Asp Leu Asn Trp Asp Asn Pro Lys Val ArgGln Asp Leu Tyr 210 215 220 Ala Met Leu Arg Phe Trp Leu Asp Lys Gly ValSer Gly Leu Arg Phe 225 230 235 240 Asp Thr Val Ala Thr Tyr Ser Lys IlePro Asp Phe Pro Asn Leu Thr 245 250 255 Gln Gln Gln Leu Lys Asn Phe AlaAla Glu Tyr Thr Lys Gly Pro Asn 260 265 270 Ile His Arg Tyr Val Asn GluMet Asn Lys Glu Val Leu Ser His Tyr 275 280 285 Asp Ile Ala Thr Ala GlyGlu Ile Phe Gly Val Pro Leu Asp Gln Ser 290 295 300 Ile Lys Phe Phe AspArg Arg Arg 
Asp Glu Leu Asn Ile Ala Phe Thr 305 310 315 320 Phe Asp LeuIle Arg Leu Asp Arg Asp Ser Asp Gln Arg Trp Arg Arg 325 330 335 Lys AspTrp Lys Leu Ser Gln Phe Arg Gln Ile Ile Asp Asn Val Asp 340 345 350 ArgThr Ala Gly Glu Tyr Gly Trp Asn Ala Phe Phe Leu Asp Asn His 355 360 365Asp Asn Pro Arg Ala Val Ser His Phe Gly Asp Asp Asp Arg Pro Gln 370 375380 Trp Arg Glu Pro Ser Ala Lys Ala Leu Ala Thr Leu Thr Leu Thr Gln 385390 395 400 Arg Ala Thr Pro Phe Ile Tyr Gln Gly Ser Glu Leu Gly Met ThrAsn 405 410 415 Tyr Pro Phe Lys Ala Ile Asp Glu Phe Asp Asp Ile Glu ValLys Gly 420 425 430 Phe Trp His Asp Tyr Val Glu Thr Gly Lys Val Lys AlaAsp Glu Phe 435 440 445 Leu Gln Asn Val Arg Leu Thr Ser Arg Asp Asn SerArg Thr Pro Phe 450 455 460 Gln Trp Asp Gly Ser Lys Asn Ala Gly Phe ThrSer Gly Lys Pro Trp 465 470 475 480 Phe Lys Val Asn Pro Asn Tyr Gln GluIle Asn Ala Val Ser Gln Val 485 490 495 Thr Gln Pro Asp Ser Val Phe AsnTyr Tyr Arg Gln Leu Ile Lys Ile 500 505 510 Arg His Asp Ile Pro Ala LeuThr Tyr Gly Thr Tyr Thr Asp Leu Asp 515 520 525 Pro Ala Asn Asp Ser ValTyr Ala Tyr Thr Arg Ser Leu Gly Ala Glu 530 535 540 Lys Tyr Leu Val ValVal Asn Phe Lys Glu Gln Met Met Arg Tyr Lys 545 550 555 560 Leu Pro AspAsn Leu Ser Ile Glu Lys Val Ile Ile Asp Ser Asn Ser 565 570 575 Lys AsnVal Val Lys Lys Asn Asp Ser Leu Leu Glu Leu Lys Pro Trp 580 585 590 GlnSer Gly Val Tyr Lys Thr Lys Ser Ile Asn Leu Ile Val Thr Pro 595 600 605Asn Asn Val Asn Ile Leu Lys Leu Leu Lys Pro Ala Phe Tyr Ala Gly 610 615620 Phe Phe Ser Ala Lys 625 435 amino acids amino acid single linearpeptide Peptide 10 /note= “X = Unknown” Peptide 29 /note= “X = Unknown”5 Met Ser Ser Gln Gly Leu Lys Thr Ala Xaa Ala Ile Phe Leu Ala Thr 1 5 1015 Thr Phe Ser Ala Thr Ser Tyr Gln Ala Cys Ser Ala Xaa Pro Asp Thr 20 2530 Ala Pro Ser Leu Thr Val Gln Gln Ser Asn Ala Leu Pro Thr Trp Trp 35 4045 Lys Gln Ala Val Phe Tyr Gln Val Tyr Pro Arg Ser Phe Lys Asp Thr 50 5560 Asn Gly Asp Gly Ile Gly Asp Leu Asn Gly Ile Ile Glu Asn Leu Asp 65 7075 80 Tyr Leu Lys Lys 
Leu Gly Ile Asp Ala Ile Trp Ile Asn Pro His Tyr 8590 95 Asp Ser Pro Asn Thr Asp Asn Gly Tyr Asp Ile Arg Asp Tyr Arg Lys100 105 110 Ile Met Lys Glu Tyr Gly Thr Met Glu Asp Phe Asp Arg Leu IleSer 115 120 125 Glu Met Lys Lys Arg Asn Met Arg Leu Met Ile Asp Ile ValIle Asn 130 135 140 His Thr Ser Asp Gln His Ala Trp Phe Val Gln Ser LysSer Gly Lys 145 150 155 160 Asn Asn Pro Tyr Arg Asp Tyr Tyr Phe Trp ArgAsp Gly Lys Asp Gly 165 170 175 His Ala Pro Asn Asn Tyr Pro Ser Phe PheGly Gly Ser Ala Trp Glu 180 185 190 Lys Asp Asp Lys Ser Gly Gln Tyr TyrLeu His Tyr Phe Ala Lys Gln 195 200 205 Gln Pro Asp Leu Asn Trp Asp AsnPro Lys Val Arg Gln Asp Leu Tyr 210 215 220 Asp Met Leu Arg Phe Trp LeuAsp Lys Gly Val Ser Gly Leu Arg Phe 225 230 235 240 Asp Thr Val Ala ThrTyr Ser Lys Ile Pro Asn Phe Pro Asp Leu Ser 245 250 255 Gln Gln Gln LeuLys Asn Phe Ala Glu Glu Tyr Thr Lys Gly Pro Lys 260 265 270 Ile His AspTyr Val Asn Glu Met Asn Arg Glu Val Leu Ser His Tyr 275 280 285 Asp IleAla Thr Ala Gly Glu Ile Phe Gly Val Pro Leu Asp Lys Ser 290 295 300 IleLys Phe Phe Asp Arg Arg Arg Asn Glu Leu Asn Ile Ala Phe Thr 305 310 315320 Phe Asp Leu Ile Arg Leu Asp Arg Asp Ala Asp Glu Arg Trp Arg Arg 325330 335 Lys Asp Trp Thr Leu Ser Gln Phe Arg Lys Ile Val Asp Lys Val Asp340 345 350 Gln Thr Ala Gly Glu Tyr Gly Trp Asn Ala Phe Phe Leu Asp AsnHis 355 360 365 Asp Asn Pro Arg Ala Val Ser His Phe Gly Asp Asp Arg ProGln Trp 370 375 380 Arg Glu His Ala Ala Lys Ala Leu Ala Thr Leu Thr LeuThr Gln Arg 385 390 395 400 Ala Thr Pro Phe Ile Tyr Gln Gly Ser Glu LeuGly Met Thr Asn Tyr 405 410 415 Pro Phe Lys Lys Ile Asp Asp Phe Asp AspVal Glu Val Lys Gly Phe 420 425 430 Trp Gln Asp 435 157 amino acidsamino acid single linear peptide 6 Val Phe Tyr Gln Ile Tyr Pro Arg SerPhe Lys Asp Thr Asn Asp Asp 1 5 10 15 Gly Ile Gly Asp Ile Arg Gly IleIle Glu Lys Leu Asp Tyr Leu Lys 20 25 30 Ser Leu Gly Ile Asp Ala Ile TrpIle Asn Pro His Tyr Asp Ser Pro 35 40 45 Asn Thr Asp Asn Gly Tyr Asp IleSer Asn Tyr Arg Gln Ile Met Lys 50 55 
60 Glu Tyr Gly Thr Met Glu Asp PheAsp Ser Leu Val Ala Glu Met Lys 65 70 75 80 Lys Arg Asn Met Arg Leu MetIle Asp Val Val Ile Asn His Thr Ser 85 90 95 Asp Gln His Pro Trp Phe IleGln Ser Lys Ser Asp Lys Asn Asn Pro 100 105 110 Tyr Arg Asp Tyr Tyr PheTrp Arg Asp Gly Lys Asp Asn Gln Pro Pro 115 120 125 Asn Asn Tyr Pro SerPhe Phe Gly Gly Ser Ala Trp Gln Lys Asp Ala 130 135 140 Lys Ser Gly GlnTyr Tyr Leu His Tyr Phe Ala Arg Gln 145 150 155 1362 base pairs nucleicacid single linear DNA (genomic) 7 ATGGCTACAA AAATCGTTTT AGTGGGCGCAGGCAGCGCGC AATTCGGCTA CGGCACCCTG 60 GGCGATATCT TCCAGAGCAA GACGCTGTACGGCAGTGAAA TTGTGCTGCA TGACATCAAC 120 CCAACCTCGC TGGCCGTGAC CGAGAAAACCGCCCGTGACT TCCTGGCTGC GGAAGATCTG 180 CCGTTTATCG TCAGCGCCAC CACCGATCGCAAAACCGCGC TGAGCGGAGC GGAGTTCGTG 240 ATTATCTCCA TTGAAGTGGG CGACCGCTTTGCCCTGTGGG ATCTCGACTG GCAGATCCCG 300 CAACAGTATG GCATTCAGCA GGTGTATGGTGAAAACGGTG GCCCTGGCGG GCTGTTCCAC 360 TCGCTGCGCA TCATTCCACC GATCCTCGACATCTGCGCCG ACGTGGCGGA CATTTGCCCG 420 AACGCCTGGG TATTCAACTA CTCGAACCCGATGAGCCGCA TTTGCACCAC CGTGCATCGC 480 CGTTTCCCGC AGCTCAACTT TGTCGGCATGTGCCATGAAA TCGCCTCACT TGAGCGTTAT 540 CTGCCAGAAA TGCTCGGCAC CTCCTTCGACAATCTCACTC TGCGCGCTGC CGGGCTGAAC 600 CACTTCAGCG TGTTGCTGGA GGCCAGCTATAAAGACAGCG GAAAAGACGC TTACGCCGAC 660 GTACGCGCCA AGGCACCGGA CTATTTCTCCCGTCTGCCGG CGTACAGCGA TATTCTGGCT 720 TACACCCGCA ATCACGGCAA ATTGGTGGAGACAGAAGGCA GCACCGAACG CGATGCGCTG 780 GGCGGCAAAG ACAGCGCCTA TCCGTGGGCGGACCGCACGC TGTTCAAAGA GATCCTGGAG 840 AAGTTTCACC ATTTGCCGAT CACCGGCGACAGCCACTTTG GCGAGTACAT CCGTTGGGCC 900 AGCGAAGTCA GCGATCACCG CGGTATCCTCGATTTCTACA CCTTCTACCG CAACTATCTG 960 GGGCATGTGC AGCCAAAAAT CGAACTGAAGCTGAAAGAAC GCGTGGTGCC GATCATGGAA 1020 GGGATCCTCA CCGATTCCGG TTATGAAGAGTCTGCGGTCA ACATTCCGAA CCAGGGATTT 1080 ATCAAGCAAC TGCCGGCGTT TATTGCCGTCGAAGTCCCGG CGATTATCGA CCGCAAGGGC 1140 GTGCACGGCA TCAAGGTCGA TATGCCTGCGGGCATCGGTG GCCTGTTGAG CAACCAGATT 1200 GCGATTCACG ATCTGACCGC CGACGCAGTGATTGAAGGCT CGCGCGACCT GGTTATCCAG 1260 GCGCTGCTGG TGGACTCGGT CAACGATAAATGCCGCGCGA 
TACCGGAACT GGTGGACGTG 1320 ATGATCTCAC GCCAGGGGCC GTGGCTCGATTACCTGAAAT AA 1362 453 amino acids amino acid single linear peptide 8Met Ala Thr Lys Ile Val Leu Val Gly Ala Gly Ser Ala Gln Phe Gly 1 5 1015 Tyr Gly Thr Leu Gly Asp Ile Phe Gln Ser Lys Thr Leu Tyr Gly Ser 20 2530 Glu Ile Val Leu His Asp Ile Asn Pro Thr Ser Leu Ala Val Thr Glu 35 4045 Lys Thr Ala Arg Asp Phe Leu Ala Ala Glu Asp Leu Pro Phe Ile Val 50 5560 Ser Ala Thr Thr Asp Arg Lys Thr Ala Leu Ser Gly Ala Glu Phe Val 65 7075 80 Ile Ile Ser Ile Glu Val Gly Asp Arg Phe Ala Leu Trp Asp Leu Asp 8590 95 Trp Gln Ile Pro Gln Gln Tyr Gly Ile Gln Gln Val Tyr Gly Glu Asn100 105 110 Gly Gly Pro Gly Gly Leu Phe His Ser Leu Arg Ile Ile Pro ProIle 115 120 125 Leu Asp Ile Cys Ala Asp Val Ala Asp Ile Cys Pro Asn AlaTrp Val 130 135 140 Phe Asn Tyr Ser Asn Pro Met Ser Arg Ile Cys Thr ThrVal His Arg 145 150 155 160 Arg Phe Pro Gln Leu Asn Phe Val Gly Met CysHis Glu Ile Ala Ser 165 170 175 Leu Glu Arg Tyr Leu Pro Glu Met Leu GlyThr Ser Phe Asp Asn Leu 180 185 190 Thr Leu Arg Ala Ala Gly Leu Asn HisPhe Ser Val Leu Leu Glu Ala 195 200 205 Ser Tyr Lys Asp Ser Gly Lys AspAla Tyr Ala Asp Val Arg Ala Lys 210 215 220 Ala Pro Asp Tyr Phe Ser ArgLeu Pro Gly Tyr Ser Asp Ile Leu Ala 225 230 235 240 Tyr Thr Arg Asn HisGly Lys Leu Val Glu Thr Glu Gly Ser Thr Glu 245 250 255 Arg Asp Ala LeuGly Gly Lys Asp Ser Ala Tyr Pro Trp Ala Asp Arg 260 265 270 Thr Leu PheLys Glu Ile Leu Glu Lys Phe His His Leu Pro Ile Thr 275 280 285 Gly AspSer His Phe Gly Glu Tyr Ile Arg Trp Ala Ser Glu Val Ser 290 295 300 AspHis Arg Gly Ile Leu Asp Phe Tyr Thr Phe Tyr Arg Asn Tyr Leu 305 310 315320 Gly His Val Gln Pro Lys Ile Glu Leu Lys Leu Lys Glu Arg Val Val 325330 335 Pro Ile Met Glu Gly Ile Leu Thr Asp Ser Gly Tyr Glu Glu Ser Ala340 345 350 Val Asn Ile Pro Asn Gln Gly Phe Ile Lys Gln Leu Pro Ala PheIle 355 360 365 Ala Val Glu Val Pro Ala Ile Ile Asp Arg Lys Gly Val HisGly Ile 370 375 380 Lys Val Asp Met Pro Ala Gly Ile Gly Gly Leu Leu SerAsn Gln Ile 385 390 395 
400 Ala Ile His Asp Leu Thr Ala Asp Ala Val IleGlu Gly Ser Arg Asp 405 410 415 Leu Val Ile Gln Ala Leu Leu Val Asp SerVal Asn Asp Lys Cys Arg 420 425 430 Ala Ile Pro Glu Leu Val Asp Val MetIle Ser Arg Gln Gly Pro Trp 435 440 445 Leu Asp Tyr Leu Lys 450 1803base pairs nucleic acid single linear DNA (genomic) 9 ATGCCCCGTCAAGGATTGAA AACTGCACTA GCGATTTTTC TAACCACATC ATTATGCATC 60 TCATGCCAGCAAGCCTTCGG TACGCAACAA CCCTTGCTTA ACGAAAAGAG TATCGAACAG 120 TCGAAAACCATACCTAAATG GTGGAAGGAG GCTGTTTTTT ATCAGGTGTA TCCGCGCTCC 180 TTTAAAGACACCAACGGAGA TGGCATCGGG GATATTAACG GCATCATAGA AAAATTAGAC 240 TATCTAAAAGCCTTGGGGAT TGATGCCATT TGGATCAACC CACATTATGA TTCTCCGAAC 300 ACGGATAATGGTTACGATAT ACGTGATTAT CGAAAAATCA TGAAAGAATA TGGCACGATG 360 GAGGATTTTGACCGCCTGAT TTCTGAAATG AAAAAACGGA ATATGCGGTT GATGATTGAT 420 GTGGTCATCAACCACACCAG CGATCAAAAC GAATGGTTTG TTAAAAGTAA AAGCAGTAAG 480 GATAATCCTTATCGCGGCTA TTATTTCTGG AAAGATGCTA AAGAAGGGCA GGCGCCTAAT 540 AATTACCCTTCATTCTTTGG TGGCTCGGCG TGGCAAAAAG ATGAAAAGAC CAATCAATAC 600 TACCTGCACTATTTTGCTAA ACAACAGCCT GACCTAAACT GGGATAATCC CAAAGTCCGT 660 CAAGATCTTTATGCAATGTT ACGTTTCTGG TTAGATAAAG GCGTGTCTGG TTTACGTTTT 720 GATACGGTAGCGACCTACTC AAAAATTCCG GATTTCCCAA ATCTCACCCA ACAACAGCTG 780 AAGAATTTTGCAGCGGAGTA TACCAAGGGC CCTAATATTC ATCGTTACGT CAATGAAATG 840 AATAAAGAGGTCTTGTCTCA TTACGACATT GCGACTGCCG GTGAAATCTT TGGCGTACCC 900 TTGGATCAATCGATAAAGTT CTTCGATCGC CGCCGTGATG AGCTGAACAT TGCATTTACC 960 TTTGACTTAATCAGACTCGA TCGAGACTCT GATCAAAGAT GGCGTCGAAA AGATTGGAAA 1020 TTGTCGCAATTCCGGCAGAT CATCGATAAC GTTGACCGTA CTGCAGGAGA ATATGGTTGG 1080 AATGCCTTCTTCTTGGATAA CCACGACAAT CCGCGCGCTG TCTCGCACTT TGGCGATGAT 1140 CGCCCACAATGGCGTGAGCC ATCGGCTAAA GCGCTTGCAA CCTTGACGCT GACTCAACGA 1200 GCAACACCTTTTATTTATCA AGGTTCAGAA TTGGGCATGA CCAATTACCC GTTTAAAGCT 1260 ATTGATGAATTCGATGATAT TGAGGTGAAA GGTTTTTGGC ATGACTACGT TGAGACAGGA 1320 AAGGTCAAAGCCGACGAGTT CTTGCAAAAT GTACGCCTGA CGAGCAGGGA TAACAGCCGG 1380 ACGCCGTTCCAATGGGATGG GAGCAAAAAT GCAGGATTCA CGAGCGGAAA ACCTTGGTTC 1440 AAGGTCAACCCAAACTACCA 
GGAAATCAAT GCAGTAAGTC AAGTCACACA ACCCGACTCA 1500 GTATTTAACTATTATCGTCA GTTGATCAAG ATAAGGCATG ACATCCCGGC ACTGACCTAT 1560 GGTACATACACCGATTTGGA TCCTGCAAAT GATTCGGTCT ACGCCTATAC ACGCAGCCTT 1620 GGGGCGGAAAAATATCTTGT TGTTGTTAAC TTCAAGGAGC AAATGATGAG ATATAAATTA 1680 CCGGATAATTTATCCATTGA GAAAGTGATT ATAGACAGCA ACAGCAAAAA CGTGGTGAAA 1740 AAGAATGATTCATTACTCGA GCTAAAACCA TGGCAGTCAG GGGTTTATAA ACTAAATCAA 1800 TAA 1803 600amino acids amino acid single linear peptide 10 Met Pro Arg Gln Gly LeuLys Thr Ala Leu Ala Ile Phe Leu Thr Thr 1 5 10 15 Ser Leu Cys Ile SerCys Gln Gln Ala Phe Gly Thr Gln Gln Pro Leu 20 25 30 Leu Asn Glu Lys SerIle Glu Gln Ser Lys Thr Ile Pro Lys Trp Trp 35 40 45 Lys Glu Ala Val PheTyr Gln Val Tyr Pro Arg Ser Phe Lys Asp Thr 50 55 60 Asn Gly Asp Gly IleGly Asp Ile Asn Gly Ile Ile Glu Lys Leu Asp 65 70 75 80 Tyr Leu Lys AlaLeu Gly Ile Asp Ala Ile Trp Ile Asn Pro His Tyr 85 90 95 Asp Ser Pro AsnThr Asp Asn Gly Tyr Asp Ile Arg Asp Tyr Arg Lys 100 105 110 Ile Met LysGlu Tyr Gly Thr Met Glu Asp Phe Asp Arg Leu Ile Ser 115 120 125 Glu MetLys Lys Arg Asn Met Arg Leu Met Ile Asp Val Val Ile Asn 130 135 140 HisThr Ser Asp Gln Asn Glu Trp Phe Val Lys Ser Lys Ser Ser Lys 145 150 155160 Asp Asn Pro Tyr Arg Gly Tyr Tyr Phe Trp Lys Asp Ala Lys Glu Gly 165170 175 Gln Ala Pro Asn Asn Tyr Pro Ser Phe Phe Gly Gly Ser Ala Trp Gln180 185 190 Lys Asp Glu Lys Thr Asn Gln Tyr Tyr Leu His Tyr Phe Ala LysGln 195 200 205 Gln Pro Asp Leu Asn Trp Asp Asn Pro Lys Val Arg Gln AspLeu Tyr 210 215 220 Ala Met Leu Arg Phe Trp Leu Asp Lys Gly Val Ser GlyLeu Arg Phe 225 230 235 240 Asp Thr Val Ala Thr Tyr Ser Lys Ile Pro AspPhe Pro Asn Leu Thr 245 250 255 Gln Gln Gln Leu Lys Asn Phe Ala Ala GluTyr Thr Lys Gly Pro Asn 260 265 270 Ile His Arg Tyr Val Asn Glu Met AsnLys Glu Val Leu Ser His Tyr 275 280 285 Asp Ile Ala Thr Ala Gly Glu IlePhe Gly Val Pro Leu Asp Gln Ser 290 295 300 Ile Lys Phe Phe Asp Arg ArgArg Asp Glu Leu Asn Ile Ala Phe Thr 305 310 315 320 Phe Asp Leu Ile ArgLeu Asp Arg Asp Ser Asp 
Gln Arg Trp Arg Arg 325 330 335 Lys Asp Trp LysLeu Ser Gln Phe Arg Gln Ile Ile Asp Asn Val Asp 340 345 350 Arg Thr AlaGly Glu Tyr Gly Trp Asn Ala Phe Phe Leu Asp Asn His 355 360 365 Asp AsnPro Arg Ala Val Ser His Phe Gly Asp Asp Arg Pro Gln Trp 370 375 380 ArgGlu Pro Ser Ala Lys Ala Leu Ala Thr Leu Thr Leu Thr Gln Arg 385 390 395400 Ala Thr Pro Phe Ile Tyr Gln Gly Ser Glu Leu Gly Met Thr Asn Tyr 405410 415 Pro Phe Lys Ala Ile Asp Glu Phe Asp Asp Ile Glu Val Lys Gly Phe420 425 430 Trp His Asp Tyr Val Glu Thr Gly Lys Val Lys Ala Asp Glu PheLeu 435 440 445 Gln Asn Val Arg Leu Thr Ser Arg Asp Asn Ser Arg Thr ProPhe Gln 450 455 460 Trp Asp Gly Ser Lys Asn Ala Gly Phe Thr Ser Gly LysPro Trp Phe 465 470 475 480 Lys Val Asn Pro Asn Tyr Gln Glu Ile Asn AlaVal Ser Gln Val Thr 485 490 495 Gln Pro Asp Ser Val Phe Asn Tyr Tyr ArgGln Leu Ile Lys Ile Arg 500 505 510 His Asp Ile Pro Ala Leu Thr Tyr GlyThr Tyr Thr Asp Leu Asp Pro 515 520 525 Ala Asn Asp Ser Val Tyr Ala TyrThr Arg Ser Leu Gly Ala Glu Lys 530 535 540 Tyr Leu Val Val Val Asn PheLys Glu Gln Met Met Arg Tyr Lys Leu 545 550 555 560 Pro Asp Asn Leu SerIle Glu Lys Val Ile Ile Asp Ser Asn Ser Lys 565 570 575 Asn Val Val LysLys Asn Asp Ser Leu Leu Glu Leu Lys Pro Trp Gln 580 585 590 Ser Gly ValTyr Lys Leu Asn Gln 595 600 1794 base pairs nucleic acid single linearDNA (genomic) misc_RNA 810 /note= “D = Unknown” misc_RNA 1471 /note= “S= Unknown” 11 ATGTCTTTTG TTACGCTACG TACCGGGGTG GCTGTCGCGC TGTCATCTTTGATAATAAGT 60 CTGGCCTGCC CGGCTGTCAG TGCTGCACCA TCCTTGAATC AGGATATTCACGTTCAAAAG 120 GAAAGTGAAT ATCCTGCATG GTGGAAAGAA GCTGTTTTTT ATCAGATCTATCCTCGCTCA 180 TTTAAAGACA CCAATGATGA TGGCATTGGC GATATTCGCG GTATTATTGAAAAGCTGGAC 240 TATCTGAAAT CGCTCGGTAT TGACGCTATC TGGATCAATC CCCATTACGACTCTCCGAAC 300 ACCGATAACG GCTATGACAT CAGTAATTAT CGTCAGATAA TGAAAGAGTATGGCACAATG 360 GAGGATTTTG ATAGCCTTGT TGCCGAAATG AAAAAACGAA ATATGCGCTTAATGATCGAC 420 GTGGTCATTA ACCATACCAG TGATCAACAC CCGTGGTTTA TTCAGAGTAAAAGCGATAAA 480 AACAACCCTT ATCGTGACTA TTATTTCTGG 
CGTGACGGAA AAGATAATCAGCCACCTAAT 540 AATTACCCCT CATTTTTCGG CGGCTCGGCA TGGCAAAAAG ATGCAAAGTCAGGACAGTAC 600 TATTTACACT ATTTTGCCAG ACAGCAACCT GATCTCAACT GGGATAACCCGAAAGTACGT 660 GAGGATCTTT ACGCAATGCT CCGCTTCTGG CTGGATAAAG GCGTTTCAGGCATGCGATTT 720 GATACGGTGG CAACTTATTC CAAAATCCCG GGATTTCCCA ATCTGACACCTGAACAACAG 780 AAAAATTTTG CTGAACAATA CACCATGGGD CCTAATATTC ATCGATACATTCAGGAAATG 840 AACCGGAAAG TTCTGTCCCG GTATGATGTG GCCACCGCGG GTGAAATTTTTGGCGTCCCG 900 CTGGATCGTT CGTCGCAGTT TTTTGATCGC CGCCGACATG AGCTGAATATGGCGTTTATG 960 TTTGACCTCA TTCGTCTCGA TCGCGACAGC AATGAACGCT GGCGTCACAAGTCGTGGTCG 1020 CTCTCTCAGT TCCGCCAGAT CATCAGCAAA ATGGATGTCA CGGTCGGAAAGTATGGCTGG 1080 AACACGTTCT TCTTAGACAA CCATGACAAC CCCCGTGCGG TATCTCACTTCGGGGATGAC 1140 AGGCCGCAAT GGCGGGAGGC GTCGGCTAAG GCACTGGCGA CGATTACCCTCACTCAGCGG 1200 GCGACGCCGT TTATTTATCA GGGTTCAGAG CTGGGAATGA CTAATTATCCCTTCAGGCAA 1260 CTCAACGAAT TTGACGACAT CGAGGTCAAA GGTTTCTGGC AGGATTATGTCCAGAGTGGA 1320 AAAGTCACGG CCACAGAGTT TCTCGATAAT GTGCGCCTGA CGAGCCGCGATAACAGCAGA 1380 ACACCTTTCC AGTGGAATGA CACCCTGAAT GCTGGTTTTA CTCGCGGAAAGCCGTGGTTT 1440 CACATCAACC CAAACTATGT GGAGATCAAC SCCGAACGCG AAGAAACCCGCGAAGATTCA 1500 GTGCTGAATT ACTATAAAAA AATGATTCAG CTACGCCACC ATATCCCTGCTCTGGTATAT 1560 GGCGCCTATC AGGATCTTAA TCCACAGGAC AATACCGTTT ATGCCTATACCCGAACGCTG 1620 GGTAACGAGC GTTATCTGGT CGTGGTGAAC TTTAAGGAGT ACCCGGTCCGCTATACTCTC 1680 CCGGCTAATG ATGCCATCGA GGAAGTGGTC ATTGATACTC AGCAGCAAGGTGCGCCGCAC 1740 AGCACATCCC TGTCATTGAG CCCCTGGCAG GCAGGTGCGT ATAAGCTGCGGTAA 1794 597 amino acids amino acid single linear peptide Peptide 270/note= “X = Unknown” Peptide 491 /note= “X = Unknown” 12 Met Ser Phe ValThr Leu Arg Thr Gly Val Ala Val Ala Leu Ser Ser 1 5 10 15 Leu Ile IleSer Leu Ala Cys Pro Ala Val Ser Ala Ala Pro Ser Leu 20 25 30 Asn Gln AspIle His Val Gln Lys Glu Ser Glu Tyr Pro Ala Trp Trp 35 40 45 Lys Glu AlaVal Phe Tyr Gln Ile Tyr Pro Arg Ser Phe Lys Asp Thr 50 55 60 Asn Asp AspGly Ile Gly Asp Ile Arg Gly Ile Ile Glu Lys Leu Asp 65 70 75 80 Tyr LeuLys Ser Leu Gly 
Ile Asp Ala Ile Trp Ile Asn Pro His Tyr 85 90 95 Asp SerPro Asn Thr Asp Asn Gly Tyr Asp Ile Ser Asn Tyr Arg Gln 100 105 110 IleMet Lys Glu Tyr Gly Thr Met Glu Asp Phe Asp Ser Leu Val Ala 115 120 125Glu Met Lys Lys Arg Asn Met Arg Leu Met Ile Asp Val Val Ile Asn 130 135140 His Thr Ser Asp Gln His Pro Trp Phe Ile Gln Ser Lys Ser Asp Lys 145150 155 160 Asn Asn Pro Tyr Arg Asp Tyr Tyr Phe Trp Arg Asp Gly Lys AspAsn 165 170 175 Gln Pro Pro Asn Asn Tyr Pro Ser Phe Phe Gly Gly Ser AlaTrp Gln 180 185 190 Lys Asp Ala Lys Ser Gly Gln Tyr Tyr Leu His Tyr PheAla Arg Gln 195 200 205 Gln Pro Asp Leu Asn Trp Asp Asn Pro Lys Val ArgGlu Asp Leu Tyr 210 215 220 Ala Met Leu Arg Phe Trp Leu Asp Lys Gly ValSer Gly Met Arg Phe 225 230 235 240 Asp Thr Val Ala Thr Tyr Ser Lys IlePro Gly Phe Pro Asn Leu Thr 245 250 255 Pro Glu Gln Gln Lys Asn Phe AlaGlu Gln Tyr Thr Met Xaa Pro Asn 260 265 270 Ile His Arg Tyr Ile Gln GluMet Asn Arg Lys Val Leu Ser Arg Tyr 275 280 285 Asp Val Ala Thr Ala GlyGlu Ile Phe Gly Val Pro Leu Asp Arg Ser 290 295 300 Ser Gln Phe Phe AspArg Arg Arg His Glu Leu Asn Met Ala Phe Met 305 310 315 320 Phe Asp LeuIle Arg Leu Asp Arg Asp Ser Asn Glu Arg Trp Arg His 325 330 335 Lys SerTrp Ser Leu Ser Gln Phe Arg Gln Ile Ile Ser Lys Met Asp 340 345 350 ValThr Val Gly Lys Tyr Gly Trp Asn Thr Phe Phe Leu Asp Asn His 355 360 365Asp Asn Pro Arg Ala Val Ser His Phe Gly Asp Asp Arg Pro Gln Trp 370 375380 Arg Glu Ala Ser Ala Lys Ala Leu Ala Thr Ile Thr Leu Thr Gln Arg 385390 395 400 Ala Thr Pro Phe Ile Tyr Gln Gly Ser Glu Leu Gly Met Thr AsnTyr 405 410 415 Pro Phe Arg Gln Leu Asn Glu Phe Asp Asp Ile Glu Val LysGly Phe 420 425 430 Trp Gln Asp Tyr Val Gln Ser Gly Lys Val Thr Ala ThrGlu Phe Leu 435 440 445 Asp Asn Val Arg Leu Thr Ser Arg Asp Asn Ser ArgThr Pro Phe Gln 450 455 460 Trp Asn Asp Thr Leu Asn Ala Gly Phe Thr ArgGly Lys Pro Trp Phe 465 470 475 480 His Ile Asn Pro Asn Tyr Val Glu IleAsn Xaa Glu Arg Glu Glu Thr 485 490 495 Arg Glu Asp Ser Val Leu Asn TyrTyr Lys Lys Met Ile Gln 
Leu Arg 500 505 510 His His Ile Pro Ala Leu ValTyr Gly Ala Tyr Gln Asp Leu Asn Pro 515 520 525 Gln Asp Asn Thr Val TyrAla Tyr Thr Arg Thr Leu Gly Asn Glu Arg 530 535 540 Tyr Leu Val Val ValAsn Phe Lys Glu Tyr Pro Val Arg Tyr Thr Leu 545 550 555 560 Pro Ala AsnAsp Ala Ile Glu Glu Val Val Ile Asp Thr Gln Gln Gln 565 570 575 Gly AlaPro His Ser Thr Ser Leu Ser Leu Ser Pro Trp Gln Ala Gly 580 585 590 AlaTyr Lys Leu Arg 595 1782 base pairs nucleic acid single linear DNA(genomic) misc_RNA 1237..1331 /note= “N = Unknown” 13 ATGCTTATGAAGAGATTATT CGCCGCGTCT CTGATGCTTG CTTTTTCAAG CGTCTCCTCT 60 GTGAGGGCTGAGGAGGCCGT AAAGCCGGGC GCGCCATGGT GGAAAAGTGC TGTCTTCTAT 120 CAGGTCTATCCGCGCTCGTT CAAGGATACC AACGGTGATG GGATCGGCGA TTTCAAAGGA 180 CTGACGGAGAAGCTCGACTA TCTCAAGGGG CTCGGCATAG ACGCCATCTG GATCAATCCA 240 CATTACGCGTCTCCCAACAC CGATAATGGC TACGATATCA GCGACTATCG AGAGGTCATG 300 AAGGAATATGGGACGATGGA GGACTTCGAT CGTCTGATGG CTGAGTTGAA GAAGCGCGGC 360 ATGCGGCTCATGGTTGATGT CGTGATCAAC CATTCGAGTG ACCAACACGA ATGGTTCAAG 420 AGCAGCCGGGCCTCCAAAGA CAATCCCTAC CGTGACTATT ATTTCTGGCG TGACGGCAAA 480 GACGGTCACGAGCCAAACAA TTACCCTTCC TTCTTCGGCG GTTCGGCATG GGAGAAGGAC 540 CCCGTAACCGGGCAATATTA CCTGCATTAT TTCGGTCGTC AGCAGCCAGA TCTGAACTGG 600 GACACGCCGAAGCTTCGCGA GGAACTCTAT GCGATGCTGC GGTTCTGGCT CGACAAGGGC 660 GTATCAGGCATGCGGTTCGA TACGGTGGCT ACCTACTCGA AGACACCGGG TTTCCCGGAT 720 CTGACACCGGAGCAGATGAA GAACTTCGCG GAGGCCTATA CCCAGGGGCC GAACCTTCAT 780 CGTTACCTGCAGGAAATGCA CGAGAAGGTC TTCGATCATT ATGACGCGGT CACGGCCGGC 840 GAAATCTTCGGCGCTCCGCT CAATCAAGTG CCGCTGTTCA TCGACAGCCG GAGGAAAGAG 900 CTGGATATGGCTTTCACCTT CGATCTGATC CGTTATGATC GCGCACTGGA TCGTTGGCAT 960 ACCATTCCGCGTACCTTAGC GGACTTCCGT CAAACGATCG ATAAGGTCGA CGCCATCGCG 1020 GGCGAATATGGCTGGAACAC GTTCTTCCTC GGCAATCACG ACAATCCCCG TGCGGTATCG 1080 CATTTTGGTGACGATCGGCC GCAATGGCGC GAAGCCTCGG CCAAGGCTCT GGCCACCGTC 1140 ACCTTGACCCAGCGAGGAAC GCCGTTCATC TTCCAAGGAG ATGAACTCGG AATGACCAAC 1200 TACCCCTTCAAGACGCTGCA GGACTTTGAT GATATCNNNN NNNNNNNNNN NNNNNNNNNN 1260 NNNNNNNNNNNNNNNNNNNN 
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 1320 NNNNNNNNNNNTGTGGCGTT GACTAGCCGA GCAAACGCCC GCACGCCCTT TCAATGGGAT 1380 GACAGTGCTAATGCGGGATT CACAACTGGC AAGCCTTGGC TAAAGGTCAA TCCAAACTAC 1440 ACTGAGATCAACGCCGCGCG GGAAATTGGC GATCCTAAAT CGGTCTACAG CTTTTACCGC 1500 AACCTGATCTCAATCCGGCA TGAAACTCCC GCTCTTTCGA CCGGGAGCTA TCGCGACATC 1560 GATCCGAGTAATGCCGATGT CTATGCCTAT ACGCGCAGCC AGGATGGCGA GACCTATCTG 1620 GTCGTAGTCAACTTCAAGGC AGAGCCAAGG AGTTTCACGC TTCCGGACGG CATGCATATT 1680 GCCGAAACCCTGATTGAGAG CAGTTCGCCA GCAGCTCCGG CGGCGGGGGC TGCAAGCCTT 1740 GAGCTGCAGCCTTGGCAGTC CGGCATCTAC AAGGTGAAGT AA 1782 593 amino acids amino acidsingle linear peptide Peptide 413..444 /note= “Xaa = Unknown” 14 Met LeuMet Lys Arg Leu Phe Ala Ala Ser Leu Met Leu Ala Phe Ser 1 5 10 15 SerVal Ser Ser Val Arg Ala Glu Glu Ala Val Lys Pro Gly Ala Pro 20 25 30 TrpTrp Lys Ser Ala Val Phe Tyr Gln Val Tyr Pro Arg Ser Phe Lys 35 40 45 AspThr Asn Gly Asp Gly Ile Gly Asp Phe Lys Gly Leu Thr Glu Lys 50 55 60 LeuAsp Tyr Leu Lys Gly Leu Gly Ile Asp Ala Ile Trp Ile Asn Pro 65 70 75 80His Tyr Ala Ser Pro Asn Thr Asp Asn Gly Tyr Asp Ile Ser Asp Tyr 85 90 95Arg Glu Val Met Lys Glu Tyr Gly Thr Met Glu Asp Phe Asp Arg Leu 100 105110 Met Ala Glu Leu Lys Lys Arg Gly Met Arg Leu Met Val Asp Val Val 115120 125 Ile Asn His Ser Ser Asp Gln His Glu Trp Phe Lys Ser Ser Arg Ala130 135 140 Ser Lys Asp Asn Pro Tyr Arg Asp Tyr Tyr Phe Trp Arg Asp GlyLys 145 150 155 160 Asp Gly His Glu Pro Asn Asn Tyr Pro Ser Phe Phe GlyGly Ser Ala 165 170 175 Trp Glu Lys Asp Pro Val Thr Gly Gln Tyr Tyr LeuHis Tyr Phe Gly 180 185 190 Arg Gln Gln Pro Asp Leu Asn Trp Asp Thr ProLys Leu Arg Glu Glu 195 200 205 Leu Tyr Ala Met Leu Arg Phe Trp Leu AspLys Gly Val Ser Gly Met 210 215 220 Arg Phe Asp Thr Val Ala Thr Tyr SerLys Thr Pro Gly Phe Pro Asp 225 230 235 240 Leu Thr Pro Glu Gln Met LeuAsn Phe Ala Glu Ala Tyr Thr Gln Gly 245 250 255 Pro Asn Leu His Arg TyrLeu Gln Glu Met His Glu Lys Val Phe Asp 260 265 270 His Tyr Asp Ala ValThr Ala Gly Glu Ile Phe Gly Ala 
Pro Leu Asn 275 280 285 Gln Val Pro LeuPhe Ile Asp Ser Arg Arg Lys Glu Leu Asp Met Ala 290 295 300 Phe Thr PheAsp Leu Ile Arg Tyr Asp Arg Ala Leu Asp Arg Trp His 305 310 315 320 ThrIle Pro Arg Thr Leu Ala Asp Phe Arg Gln Thr Ile Asp Lys Val 325 330 335Asp Ala Ile Ala Gly Glu Tyr Gly Trp Asn Thr Phe Phe Leu Gly Asn 340 345350 His Asp Asn Pro Arg Ala Val Ser His Phe Gly Asp Asp Arg Pro Gln 355360 365 Trp Arg Glu Ala Ser Ala Lys Ala Leu Ala Thr Val Thr Leu Thr Gln370 375 380 Arg Gly Thr Pro Phe Ile Phe Gln Gly Asp Glu Leu Gly Met ThrAsn 385 390 395 400 Tyr Pro Phe Lys Thr Leu Gln Asp Phe Asp Asp Ile XaaXaa Xaa Xaa 405 410 415 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa XaaXaa Xaa Xaa Xaa 420 425 430 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa XaaXaa Val Ala Leu Thr 435 440 445 Ser Arg Ala Asn Ala Arg Thr Pro Phe GlnTrp Asp Asp Ser Ala Asn 450 455 460 Ala Gly Phe Thr Thr Gly Lys Pro TrpLeu Lys Val Asn Pro Asn Tyr 465 470 475 480 Thr Glu Ile Asn Ala Ala ArgGlu Ile Gly Asp Pro Lys Ser Val Tyr 485 490 495 Ser Phe Tyr Arg Asn LeuIle Ser Ile Arg His Glu Thr Pro Ala Leu 500 505 510 Ser Thr Gly Ser TyrArg Asp Ile Asp Pro Ser Asn Ala Asp Val Tyr 515 520 525 Ala Tyr Thr ArgSer Gln Asp Gly Glu Thr Tyr Leu Val Val Val Asn 530 535 540 Phe Lys AlaGlu Pro Arg Ser Phe Thr Leu Pro Asp Gly Met His Ile 545 550 555 560 AlaGlu Thr Leu Ile Glu Ser Ser Ser Pro Ala Ala Pro Ala Ala Gly 565 570 575Ala Ala Ser Leu Glu Leu Gln Pro Trp Gln Ser Gly Ile Tyr Lys Val 580 585590 Lys 1704 base pairs nucleic acid single linear DNA (genomic) 15ATGACTGAAA AGTTATCCTT CGAGTCGACA ACAATCTCGC GTCGCTGGTG GAAAGAGGCT 60GTTGTCTATC AGGTGTATCC CCGCTCGTTC CAGGATTCGA ACGGGGACGG CATCGGCGAC 120CTTCCGGGCA TAACTGCGAG GCTAGATTAC ATCCTCGGTC TAGGCGTTAG TGTCATCTGG 180CTCAGCCCCC ATTTCGACTC TCCGAATGCT GACAACGGCT ACGATATCCG TGACTATCGC 240AAGGTGATGC GCGAATTCGG CACCATGGCG GATTTCGATC ACCTGCTGGC CGAGACGAAA 300AAGCGCGGCA TGCGGCTGAT CATCGATCTC GTCGTCAACC ATACCAGCGA CGAGCATGTC 360TGGTTTGCCG AAAGCCGGGC CTCGAAAAAC AGCCCGTACC GTGATTACTA 
CATCTGGCAT 420CCCGGCCGGG ACGGCGCCGA GCCGAACGAC TGGCGCTCAT TTTTCTCGGG CTCGGCATGG 480ACTTTCGACC AGCCAACCGG CGAATACTAC ATGCATCTTT TCGCCGATAA ACAGCCGGAT 540ATCAACTGGG ACAATCCGGC TGTGCGCGCC GATGTCTATG ACATCATGCG CTTTTGGCTG 600GACAAGGGCG TCGACGGATT CCGCATGGAT GTCATCCCCT TCATCTCCAA GCAAGACGGC 660CTGCCCGACT ATCCTGACCA TCATCGCGGC GCGCCGCAGT TTTTCCACGG TTCGGGTCCC 720CGCTTGCACG ACTATCTTCA GGAAATGAAC CGCGAGGTAT TGTCGCATTA CGATGTGATG 780ACGGTTGGCG AGGCCTTCGG TGTGACGGCG GATGCGACGC CGCTTCTGGT CGACGAACGG 840CGCCGCGAAC TGAACATGAT CTTCAATTTC GACGCCGTGC GCATCGGCCG TGGCGAGACC 900TGGCACACTA AGCCTTGGGC CCTGCCGGAA CTTAAGGCGA TCTATGCCCG TCTGGACGCT 960GCGACCGACC AGCACTGCTG GGGTACGGTC TTTCTCTCCA ACCACGACAA TCCTCGTCTC 1020GTCTCCCGGT TCGGTGATGA TCATCCTGAC TGGCGGGTGG CGTCGGCCAA GGTTCTTGCC 1080ACACTTCTCC TAACGCTGAA GGGCACGCCT TTCATCTACC AAGGCGATGA ATTGGGCATG 1140ACCAACTATC CTCGGCTCGG TCGAGGAGAC GACGATATCG AGGTGCGCAA CGCCTGGCAG 1200GCTGAGGTCA TGACCGGTAA GGCGGATGCA GCCGAATTTC TCGGGGAGAT GCTGAAGATT 1260TCCCGCGATC ATTCCCGCAC ACCGATGCAA TGGGACGCCA GTCTCGACGG TGGTTTCACT 1320CGGGGTGAAA AGCCCTGGCT ATCGGTCAAT CCGAACTATC GGGCGATCAA TGCGGATGCG 1380GCACTCGCCG ATCCCGATTC GATCTACCAT TATTACGCCG CACTCATCCG TTTCCGGCGC 1440GAGACACCGG CGCTCATCTA CGGCGATTAT GACGACTTGG CGCCGGATCA TCCGCACCTC 1500TTCGTCTATA CAAGAACATT GGGGTCCGAG CGCTATCTGG TCGCGCTTAA CTTCTCCGGC 1560GATGCGCAGG CACTTGTTCT CCCGACAGAC CTGAGCGCCG CGTCACCTGT TATCGGGCGC 1620GCCCCGCAAG TGGACCGCAT GCAGCATGAT GCTGCACGGA TCGAGCTGAT GGGTTGGGAA 1680GCGCGGGTCT ACCACTGCGC ATGA 1704 567 amino acids amino acid single linearpeptide 16 Met Thr Glu Lys Leu Ser Phe Glu Ser Thr Thr Ile Ser Arg ArgTrp 1 5 10 15 Trp Lys Glu Ala Val Val Tyr Gln Val Tyr Pro Arg Ser PheGln Asp 20 25 30 Ser Asn Gly Asp Gly Ile Gly Asp Leu Pro Gly Ile Thr AlaArg Leu 35 40 45 Asp Tyr Ile Leu Gly Leu Gly Val Ser Val Ile Trp Leu SerPro His 50 55 60 Phe Asp Ser Pro Asn Ala Asp Asn Gly Tyr Asp Ile Arg AspTyr Arg 65 70 75 80 Lys Val Met Arg Glu Phe Gly Thr Met Ala Asp Phe AspHis Leu Leu 85 90 95 Ala 
Glu Thr Lys Lys Arg Gly Met Arg Leu Ile Ile AspLeu Val Val 100 105 110 Asn His Thr Ser Asp Glu His Val Trp Phe Ala GluSer Arg Ala Ser 115 120 125 Lys Asn Ser Pro Tyr Arg Asp Tyr Tyr Ile TrpHis Pro Gly Arg Asp 130 135 140 Gly Ala Glu Pro Asn Asp Trp Arg Ser PhePhe Ser Gly Ser Ala Trp 145 150 155 160 Thr Phe Asp Gln Pro Thr Gly GluTyr Tyr Met His Leu Phe Ala Asp 165 170 175 Lys Gln Pro Asp Ile Asn TrpAsp Asn Pro Ala Val Arg Ala Asp Val 180 185 190 Tyr Asp Ile Met Arg PheTrp Leu Asp Lys Gly Val Asp Gly Phe Arg 195 200 205 Met Asp Val Ile ProPhe Ile Ser Lys Gln Asp Gly Leu Pro Asp Tyr 210 215 220 Pro Asp His HisArg Gly Ala Pro Gln Phe Phe His Gly Ser Gly Pro 225 230 235 240 Arg LeuHis Asp Tyr Leu Gln Glu Met Asn Arg Glu Val Leu Ser His 245 250 255 TyrAsp Val Met Thr Val Gly Glu Ala Phe Gly Val Thr Ala Asp Ala 260 265 270Thr Pro Leu Leu Val Asp Glu Arg Arg Arg Glu Leu Asn Met Ile Phe 275 280285 Asn Phe Asp Ala Val Arg Ile Gly Arg Gly Glu Thr Trp His Thr Lys 290295 300 Pro Trp Ala Leu Pro Glu Leu Lys Ala Ile Tyr Ala Arg Leu Asp Ala305 310 315 320 Ala Thr Asp Gln His Cys Trp Gly Thr Val Phe Leu Ser AsnHis Asp 325 330 335 Asn Pro Arg Leu Val Ser Arg Phe Gly Asp Asp His ProAsp Trp Arg 340 345 350 Val Ala Ser Ala Lys Val Leu Ala Thr Leu Leu LeuThr Leu Lys Gly 355 360 365 Thr Pro Phe Ile Tyr Gln Gly Asp Glu Leu GlyMet Thr Asn Tyr Pro 370 375 380 Arg Leu Gly Arg Gly Asp Asp Asp Ile GluVal Arg Asn Ala Trp Gln 385 390 395 400 Ala Glu Val Met Thr Gly Lys AlaAsp Ala Ala Glu Phe Lys Gly Glu 405 410 415 Met Leu Lys Ile Ser Arg AspHis Ser Arg Thr Pro Met Gln Trp Asp 420 425 430 Ala Ser Leu Asp Gly GlyPhe Thr Arg Gly Glu Lys Pro Trp Leu Ser 435 440 445 Val Asn Pro Asn TyrArg Ala Ile Asn Ala Asp Ala Ala Leu Ala Asp 450 455 460 Pro Asp Ser IleTyr His Tyr Tyr Ala Ala Leu Ile Arg Phe Arg Arg 465 470 475 480 Glu ThrPro Ala Leu Ile Tyr Gly Asp Tyr Asp Asp Leu Ala Pro Asp 485 490 495 HisPro His Leu Phe Val Tyr Thr Arg Thr Leu Gly Ser Glu Arg Tyr 500 505 510Leu Val Ala Leu Asn Phe Ser Gly Asp 
Ala Gln Ala Leu Val Leu Pro 515 520 525 Thr Asp Leu Ser Ala Ala Ser Pro Val Ile Gly Arg Ala Pro Gln Val 530 535 540 Asp Arg Met Gln His Asp Ala Ala Arg Ile Glu Leu Met Gly Trp Glu 545 550 555 560 Ala Arg Val Tyr His Cys Ala 565
17 base pairs nucleic acid single linear DNA (genomic) 17 TGGTGGAARG ARGCTGT 17
20 base pairs nucleic acid single linear DNA (genomic) 18 TCCCAGTTCA GRTCCGGCTG 20
19 base pairs nucleic acid single linear DNA (genomic) 19 AAAGATGGCG KCGAAAAGA 19
17 base pairs nucleic acid single linear DNA (genomic) 20 TGGAATGCCT TYTTCTT 17
23 base pairs nucleic acid single linear DNA (genomic) 21 ATCCCGAAGT GGTGGAAGGA GGC 23
25 base pairs nucleic acid single linear DNA (genomic) 22 CGGAATTCTT ATGCCCCGTC AAGGA 25
17 base pairs nucleic acid single linear DNA (genomic) 23 TGGTGGAAAG AAGCTGT 17
20 base pairs nucleic acid single linear DNA (genomic) 24 TCCCAGTTCA GGTCCGGCTG 20
14 base pairs nucleic acid single linear DNA (genomic) 25 CARTTYGGYT AYGG 14
17 base pairs nucleic acid single linear DNA (genomic) 26 GTTTTCCCAG TCACGAC 17
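
The oligonucleotides of SEQ ID NO: 17 and SEQ ID NO: 18 contain the ambiguity code R, which under the standard IUPAC nucleotide nomenclature denotes A or G, so each entry stands for a mixture of defined sequences. The following Python sketch is illustrative only and is not part of the original disclosure; the function name `expand_primer` and the variable names are hypothetical. It enumerates the defined sequences covered by a degenerate oligonucleotide.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer.

    Hypothetical helper for illustration; spaces (the 10-base grouping
    used in the listing) are ignored.
    """
    seq = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in seq]
    return ["".join(combo) for combo in product(*choices)]

# Sequences copied from the listing (SEQ ID NO: 17 and SEQ ID NO: 18).
seq17 = "TGGTGGAARG ARGCTGT"
seq18 = "TCCCAGTTCA GRTCCGGCTG"

variants17 = expand_primer(seq17)
variants18 = expand_primer(seq18)

for name, variants in (("SEQ ID NO: 17", variants17), ("SEQ ID NO: 18", variants18)):
    print(name, len(variants), "variants")
    for v in variants:
        print("  ", v)
```

Under this reading, the non-degenerate oligonucleotides of SEQ ID NO: 23 and SEQ ID NO: 24 each coincide with one member of the mixtures of SEQ ID NO: 17 and SEQ ID NO: 18, respectively.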

Applicants claim:
 1. An isolated or purified protein with sucrose isomerase activity, wherein the protein is recombinant and is encoded by a DNA sequence comprising (a) a nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, and any of these sequences without the signal peptide-coding region; (b) a nucleotide sequence corresponding to the sequences from (a) within the scope of the degeneracy of the genetic code, or (c) a nucleotide sequence that hybridizes with a sequence from (a), (b), or both (a) and (b), wherein a positive hybridization signal is still observed after washing with 1×SSC and 0.1% SDS at 55° C. for one hour.
 2. An isolated or purified protein as claimed in claim 1, wherein the protein is recombinant and comprises (a) an amino acid sequence selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, and any of these sequences without the signal peptide region; or (b) an amino acid sequence that is at least 80% homologous with the sequences from (a).
 3. An isolated or purified protein as claimed in claim 1, wherein the protein is recombinant and has an amino acid sequence that is at least 90% homologous to the amino acid sequences from (a) amino acid 51-149, (b) amino acid 168-181, (c) amino acid 199-250, (d) amino acid 351-387, or (e) amino acid 390-420 of the amino acid sequence shown in SEQ ID NO: 4.
 4. A method for isolating nucleic acids that code for a protein with a sucrose isomerase activity comprising (a) preparing a gene bank from a donor organism that contains a DNA sequence coding for a protein with a sucrose isomerase activity in a suitable host organism, (b) screening the clones of the gene bank, and (c) isolating the clones which contain a nucleic acid coding for a protein with sucrose isomerase activity.
 5. A method as claimed in claim 4, wherein E. coli is used as host organism.
 6. A method as claimed in claim 4, wherein the steps of preparing a gene bank, screening the clones, and isolating the clones are performed in an E. coli strain that does not utilize galactose.
 7. A method as claimed in claim 4, wherein the clones in the gene bank are screened using nucleic acid probes that are derived from the sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13.
 8. A method as claimed in claim 7, wherein a DNA fragment which has been obtained by PCR amplification of the DNA from the donor organism using the oligonucleotide mixtures 5′-TGGTGGAA(A,G)GA(A,G)GCTGT-3′ (SEQ ID NO: 17) and 5′-TCCCAGTTCAG(A,G)TCCGGCTG-3′ (SEQ ID NO: 18) as primers is used as nucleic acid probe.
 9. Protein with palatinase activity, trehalulase activity, or both, that is encoded by a DNA sequence comprising (a) one of the nucleotide sequences shown in SEQ ID NO: 7 or SEQ ID NO: 15, (b) a nucleotide sequence characterized in that it corresponds to the sequence from (a) within the scope of the degeneracy of the genetic code, or (c) a nucleotide sequence characterized in that it hybridizes with the sequences from (a), (b), or both (a) and (b).
 10. The protein as claimed in claim 9, comprising the amino acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 16.
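
The percentage thresholds recited in claims 2 and 3 (at least 80% and at least 90% homologous) can be illustrated with a simple residue-by-residue comparison. The following Python sketch is illustrative only and is not part of the original disclosure. It assumes that homology is read as percent identity over two pre-aligned, ungapped segments of equal length, which is a simplification chosen for this example; this section does not define the comparison method. The function name `percent_identity` is hypothetical. The two segments are one-letter transcriptions of residues 52-67 of SEQ ID NO: 4 (within the region of amino acids 51-149) and residues 1-16 of SEQ ID NO: 6, as given in the sequence listing.

```python
def percent_identity(a, b):
    """Percent of positions with identical residues.

    Assumes (for illustration) that the two segments are already
    aligned without gaps and therefore have equal length.
    """
    if len(a) != len(b):
        raise ValueError("segments must be pre-aligned to equal length")
    if not a:
        raise ValueError("segments must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

# One-letter transcriptions of short stretches from the sequence listing.
seq4_52_67 = "VFYQVYPRSFKDTNGD"  # SEQ ID NO: 4, residues 52-67
seq6_1_16 = "VFYQIYPRSFKDTNDD"   # SEQ ID NO: 6, residues 1-16

identity = percent_identity(seq4_52_67, seq6_1_16)
print(f"{identity:.1f}% identical over {len(seq4_52_67)} residues")
```

A comparison of full-length sequences would normally require a gapped alignment, which this sketch deliberately omits.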